.SS(Becoming an expert,,P120:)
.BEGIN TURN ON "←";
We have already traced the development of a few LISP algorithms, and we
have given a few hints for the budding LISPer. It is time to 
reinforce these tentative starts with an intensive study of the
techniques for writing good LISP programs.
This section will spend a good deal of
time showing different styles of definition, giving hints about how to
write LISP functions, and increasing your familiarity with LISP.
For those of you who are impatiently waiting to see some %2real%* applications
of this strange programming language, we can only say "be patient".
The next chapter %2will%* develop several non-trivial algorithms, but
what we must do first is improve your skills, even at the risk of
worsening your disposition.

First some terminology is appropriate:
the style of definition which we have been using is called
%2⊗→definition by recursion↔←%*. 
The basic components of such a definition are:

.BEGIN GROUP;
.BEGIN INDENT 10,10;
%21.%* A basis case: what to compute as value for the function in
one or more particularly simple cases. A basis case is frequently referred to
as a termination case.
.END
%aREC%*
.BEGIN INDENT 10,10;
%22.%* A general case: what to compute as value for a function, given
the values of one or more previous computations on that function.
.END
.END

You should compare the structure of a %aREC%*-definition of a function
with that of an %aIND%*-definition of a set (see %aIND%* on {yon(P117)}).
Applications of %aREC%*-definitions are particularly useful in computing
values of a function defined over a set which has been defined by an
%aIND%*-definition. 
Assume that we have defined a set %dA%* using %aIND%*; then a plausible
recipe for computing a 
function  %df%*  over %dA%* would involve two parts: first, tell how to compute
%df%* on the base domain of %dA%*, and second, given values for some elements
of %dA%* say %da%41%1,#...,%da%4n%1,
use %aIND%* to generate a new element %da%1; then specify the value of %df(a)%*
as a function of the known  values of %df(a%41%*)%1,#...,# %df(a%4n%*)%1.
That is exactly the structure of %aREC%*.

.P119:
.BEGIN GROUP;
Here is another attribute of %aIND%*-definitions: If we have defined a 
set %dA%* using %aIND%*, assume we wish to prove that a certain property 
%aP%* holds for every element of %dA%*. We need only show that:
.BEGIN INDENT 10,10;
%21.%*  %aP%* holds for every element of the base domain of %dA%*.
.END
%aPRF%*
.BEGIN INDENT 10,10;
%22.%* Using the
technique we elaborated in defining the function %df%* above, if we
can show that %aP%* holds for the new element, perhaps relying on proofs
of %aP%* for sub-elements, then we should have a convincing argument that
%aP%* holds over %2all%* of %dA%*.
.END
.END

This proof technique is a generalization of a common technique
for proving properties of the integers. In that context it is called
mathematical induction.

So we are seeing an interesting parallel between inductive definitions
of sets, recursive definitions of functions, and proofs by induction.
As we proceed we will exploit various aspects of this  interrelationship.
However our task at hand is more mundane: to develop facility at
applying %aREC%* to define functions over the %aIND%*-domains 
of symbolic expressions, %aS%*,  and of sequences, %aSeq%*.

First let's be reassured that the functions we have
constructed so far do indeed satisfy %aREC%*.
Recall our example of %3equal%* on {yon(P1)}. 
The basis case involves a calculation on members of %d<atom>%*;
there we rely on %3eq%* to distinguish between distinct atoms.
The question of equality for two non-atomic S-exprs was thrown back to the
question of equality for their %3car%*s and %3cdr%*s. But that, too,
is the right thing to do since the constructed object is in fact
manufactured by %3cons%*, and %3car%* and %3cdr%* of that object
get the components.

Similar justification for %3length%* on {yon(P118)} can be given.
Here the domain is %aSeq%*. The base domain is the empty sequence, and
%3length%* is defined to give %30%* in that case. The general case in the
recursion comes from
the %aIND%*-definition of a sequence⊗↓Note ({yon(P114)}) that we didn't give
an explicit %aIND%*-definition, but rather a set of BNF equations.
The reader can easily supply the explicit definition.←.
There, given a sequence %2s%*, we made a new sequence by adding a sequence element
to the front of %2s%*. Again the computation of %3length%* parallels
this construction, saying that the length of this new sequence is
one more than the length of %2s%*.

For a more traditional  example consider the factorial function, n!.

%21.%*  The function is defined for non-negative integers.

%22.%*  The value of the function for 0 is 1.

%23%*.  Otherwise the value of n! is n times the value of (n-1)!.

It should now be clear how to write a LISP program for the factorial function:

.BEGIN CENTERIT;SELECT 3;
.P20:
←fact[n] <= [eq[n;0] → 1; %et%* → times[n;fact[sub1[n]]]] ⊗↓%3times%1 is assumed
to be a LISP function  which performs multiplication, and
 %3sub1[n]#<=#n-1%1.←

.END
The implication here is that it is somehow easier to compute (n-1)! than
to compute n!. But that too is in accord with our construction of 
the integers using the successor function.
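
For example, the recursion unwinds to the basis case, and the multiplications
are performed as the recursion returns:

.BEGIN TABIT2(25,37);SELECT 3;GROUP;
\fact[3] \= times[3;fact[2]]
\\= times[3;times[2;fact[1]]]
\\= times[3;times[2;times[1;fact[0]]]]
\\= times[3;times[2;times[1;1]]]
\\= 6
.END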

These examples are typical of LISP's recursive definitions.
The body of the definition is a conditional expression; the first few
branches involve  special cases, called %2⊗→termination conditions↔←%*.
Then  the remainder of the conditional covers the general case-- what to
do if the argument to the function is not one of the special cases.

Notice that %3fact%* is a partial function, defined only for non-negative
integers. 
When writing or reading LISP definitions pay particular attention to the
domain of definition and the range of values produced. 
The following general hints should also be useful:

%21%*.  Is it a function or predicate? This information can be used to
double-check uses of the definition. Don't use a predicate where an
S-expr-valued function is expected; and don't use an S-expr-valued function
where a list-value is expected.

%22%*.  Are there any restrictions on the argument positions? Similar
consistency checking as in %21.%* can be done with this information.

%23%*.  Are the termination conditions compatible with the restrictions
on the arguments?  If it is a recursion on lists, check for the empty
list; if it is a recursion on arbitrary S-exprs, then check for the
appearance of an atom.

%24%*.  Whenever a function call is made within the definition, are all
the restrictions on that function satisfied?

%25%*.  Don't try to do too much. Be lazy. There is usually a very simple
termination case. 
If the termination case looks messy, there is probably something wrong with your
conception of the program.
If the general case looks messy, then write some  subfunctions
to perform the brunt of the calculation. 

Use the above  suggestions when writing any subfunction. When you are
finished, no function will do very much, but the net effect of all the
functions acting in concert is a solution to your problem. That is
part of the mystery of recursive programming.


As you most likely have discovered, the real sticky business in LISP
programming is writing  your own programs. But who says programming
is easy? LISP at least makes some of your decisions easy. Its
structure is particularly spartan. So far there is only %2one%* way to write
a non-trivial algorithm in LISP: use recursion. The structure of the program
goes like that of an inductive argument. Find the right induction hypothesis
and the inductive proof is easy; find the right structure to recur on
and recursive programming is easy. It's easier to begin with unary functions;
then there's  no question about what to recur on. The only decision now
is how to terminate the recursion.  If the argument is an S-expr
we typically terminate on the occurrence of an atom. If the argument is a
list then terminate on %3()%*. If the argument is a number then terminate on zero.

First let's  consider a slightly more complicated arithmetical example, the 
⊗→Fibonacci sequence↔←: 0, 1, 1, 2, 3, 5, 8, ... . This sequence is frequently
characterized by the following recurrence relation:

.BEGIN CENTERIT;
.GROUP
←f(0) = 0
←f(1) = 1
←f(n) = f(n-1)+f(n-2);
.APART
.END

.GROUP
The translation to a LISP function is easy:

.BEGIN TABIT2(16,26);SELECT 3;GROUP;
\fib[n] <=\[eq[n;0] → 0;
\\ eq[n;1] → 1;
\\ %et%* → plus[fib[sub1[n]];fib[sub1[sub1[n]]]]]
.END

where %3plus%* is a representation of the mathematical function, %3+%*.
.APART

A few points can be made here. First, notice that the intuitive evaluation
scheme requires many duplications of computation.  For example,
computation of %3fib[5]%* requires the computation of %3fib[4]%* and %3fib[3]%*.
But within the calculation of %3fib[4]%* we must again calculate %3fib[3]%*,
etc.  It would be nice if we could restructure the definition of the function
%3fib%* to stop this duplication of calculation. Since we %2do%* wish to run
programs on a machine we should give some attention to efficiency.
To those with programming experience, the solution is easy: assign the
partial computations to temporary variables.  The problem here is that
our current subset of LISP doesn't contain assignment. There is, however,
a very useful technique which we can apply.

We will define another function, called %3fib%9'%1, on three variables %3x%*,
%3y%*, and %3n%*. The variables, %3x%* and %3y%*, will be used to carry
the partial computations. Consider:

.BEGIN TABIT2(17,34);SELECT 3;CENTERIT;
.GROUP
←fib%41%*[n] <= fib%9'%*[n;0;1]

\fib%9'%*[n;x;y] <=\[eq[n;0] → x;
\\ %et%* → fib%9'%*[sub1[n];plus[x;y];x]]
.APART
.END

This example is complicated enough to warrant examination. The initial call,
%3fib%41%*[n]%1, has the effect of calling %3fib%9'%1 with %3x%* initialized to
%30%* and with %3y%* initialized to %31%*. Each recursive call on %3fib%9'%1 within
the body of the definition, say the i%8th%* such call, has the effect
of saving the i%8th%* Fibonacci number in %3x%* and the i-1%8st%* in %3y%*.
For example:

.BEGIN TABIT2(25,37);SELECT 3;GROUP;
\fib%41%*[4] \= fib%9'%*[4;0;1]
\\= fib%9'%*[3;1;0]
\\= fib%9'%*[2;1;1]
\\= fib%9'%*[1;2;1]
\\= fib%9'%*[0;3;2]
\\= 3
.END

This same trick of using ⊗→auxiliary function↔←s can be applied to the
factorial example. When viewed computationally, the resulting definition 
will be more efficient, though
the gain in efficiency is not as apparent  as that in the Fibonacci
example ⊗↓The %3fib%41%1 example improves efficiency mostly 
by calculating fewer intermediate results.
The gain in the %3fact%41%1 example is involved with the machinery
necessary to actually execute the program: the run-time
environment, if you wish. We will discuss this when we talk about
implementation of LISP in {yonsec(P107)}. The whole question of: "what is
efficient?" is open to discussion.←. Thus:

.BEGIN TABIT1(10);SELECT 3;GROUP;CENTERIT;
.P21:
←fact%41%*[n] <= fact%9'%*[n;1];

←fact%9'%*[n;x] <= [eq[n;0] → x; %et%* → fact%9'%*[sub1[n];times[n;x]]]
.APART
.END
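
For example, the partial product is accumulated in %3x%* as the recursion descends:

.BEGIN TABIT2(25,37);SELECT 3;GROUP;
\fact%41%*[3] \= fact%9'%*[3;1]
\\= fact%9'%*[2;3]
\\= fact%9'%*[1;6]
\\= fact%9'%*[0;6]
\\= 6
.END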

It is clear in these examples that the functions %3fact, fact%41%1 and
%3fib, fib%41%1 are equivalent. Perhaps we should prove that this is so.
We presented the  crucial ideas for the proof in the discussion on {yon(P119)}
concerning %aIND%*, %aREC%* and %aPRF%*.
We shall examine the question of proofs of equivalence in {yonss(P15)}.

The trick of auxiliary functions is clearly applicable to LISP functions
defined over S-exprs:

.BEGIN CENTERIT;SELECT 3;group;
.P19:
←length[n] <= [null[n] → 0; %et%* → add1[length[rest[n]]]] ⊗↓add1[x] <= x+1←

←length%41%*[n] <= length%9'%*[n;0]

←length%9'%*[n;x] <= [null[n] → x; %et%* → length%9'%*[rest[n];add1[x]]]
%1and it seems apparent that %3length[n]%1 is equivalent to %3length%41%*[n]%1.
.APART
.END
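
A short computation shows the accumulator at work:

.BEGIN TABIT2(20,40);SELECT 3;GROUP;
\length%41%*[(A B)] \= length%9'%*[(A B);0]
\\= length%9'%*[(B);1]
\\= length%9'%*[( );2]
\\= 2
.END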

So far our examples have either been numerical or have been predicates.
Predicates only require traversing existing S-exprs; certainly  we will
want to write algorithms to build new S-exprs.  Consider the problem
of writing a LISP algorithm to  reverse a list %3x%*. The intuitive computation
is quite simple. Take elements off of %3x%* one at a time and put them
onto a new list %3y%*; as initialization, %3y%* should be  %3()%* and
the process should terminate when %3x%* is empty. Thus:

.BEGIN SELECT 3; TABIT2(32,49);

\x\y
\(A B C D)\( )
\(B C D)\(A)
\(C D)\(B A)
\(D)\(C B A)
\( )\(D C B A)
.END

Here's a plausible %3reverse%*; notice that we use a sub-function
%3rev%9'%1 to do the hard work and sneak the initialization in on
the second argument to %3rev%9'%1.

.BEGIN CENTERIT;SELECT 3;GROUP;
.P97:

←reverse[x] <= rev%9'%*[x;( )]

←rev%9'%*[x;y] <= [null[x] → y; %et%* → rev%9'%*[rest[x];concat[first[x];y]]]
.APART
.END

The function  ⊗→%3reverse%*↔←  builds a list which is the reverse of its
argument.  Notice that this definition uses an auxiliary function.  
.P16:
Sometimes it is more natural to express algorithms this way. We will
see a "direct" definition of the reversing function in a moment.

This %3reverse%* function builds up the new list 
in a very straightforward manner, %3concat%*-ing
the elements onto the second argument of %3rev%9'%*%1. Since %3y%* was initialized
to %3(#)%* we are assured that the resulting construct will be a list.
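
The table above corresponds step-for-step to the recursive calls:

.BEGIN TABIT2(20,40);SELECT 3;GROUP;
\reverse[(A B C D)] \= rev%9'%*[(A B C D);( )]
\\= rev%9'%*[(B C D);(A)]
\\= rev%9'%*[(C D);(B A)]
\\= rev%9'%*[(D);(C B A)]
\\= rev%9'%*[( );(D C B A)]
\\= (D C B A)
.END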

Construction is usually not quite so straightforward. Suppose we wish to
define a LISP function named ⊗→%3append%*↔← of two list arguments, %3x%* and %3y%*,
which is to return a new list which has %3x%* appended onto the front of %3y%*.
For example:

.BEGIN CENTERIT;SELECT 3; GROUP;

←append[(A B D);(C E)] = (A B D C E)
←append[A;(B C)] %1 is undefined%*. %3A%1 is not a list.
←%3append[(A B C);NIL] = append[NIL;(A B C)] = (A B C)
.APART
.END

So %3append%* is a partial function. It should be defined by recursion,
but recursion on which argument?  Well, if either argument is %3(#)%*
then the value given by %3append%* is the other argument. 
The next simplest case is a one-element
list; if exactly one of %3x%* or %3y%* is a singleton, how does that help us
discover the recurrence relation for appending? It doesn't help much if %3y%*
is a singleton; but if %3x%* is, then %3append%* could give:

←%3concat[first[x];y]%* as result.

So recursion on %3x%* is likely. The definition follows easily now.

.P22:
←%3append[x;y] <= [null[x] → y; %et%* → concat[first[x];append[rest[x];y]]].%1

Notice that the construction of the result is a bit more obscure than
that involved in %3reverse%*. The construction has to "wait" until
we have seen the end of the list %3x%*. For example:

.BEGIN TABIT2(10,40);SELECT 3;GROUP;
\append[(A B C);(D E F)] \= concat[A;append[(B C);(D E F)]]
\\= concat[A;concat[B;append[(C);(D E F)]]]
\\= concat[A;concat[B;concat[C;append[( );(D E F)]]]]
\\= concat[A;concat[B;concat[C;(D E F)]]]
\\= concat[A;concat[B;(C D E F)]]
\\= concat[A;(B C D E F)]
\\= (A B C D E F)
.APART
.END

We are assured  of constructing a list here because %3y%* is a list
and we are %3concat%*-ing onto the front of it. LISP functions
which are to construct list output by %3concat%*-ing %2must%*  %3concat%* onto  
the front of an existing %2list%*. That list may be either
non-empty or the empty list, %3(#)%*. 
This is why the termination condition on a list-constructing function,
such as the following function, %3dotem%*,
returns %3(#)%*.

The arguments to %3dotem%* are both lists assumed to contain the same
number of elements. The value returned is to be a list of dotted pairs; the 
elements of the pairs are the corresponding elements of the input lists. Thus:

.BEGIN SELECT 3;TABIT1(13);

dotem[x;y] <= [\null[x] → ( );
\%et%* → concat[cons[first[x];first[y]];dotem[rest[x];rest[y]]]]

.END
The first thing to note is the use of both %3concat%* and %3cons%*: %3concat%*
is used to build the final list output; %3cons%* is used to build the
dotted pairs. Now if we  had written %3dotem%* such that it knew
about our representation of lists, then %2both%* functions would have been
%3cons%*. The definition would not have been quite as clear.
Look at a computation as simple as %3dotem[(A);(B)]%*. This will involve

.BEGIN CENTERIT;SELECT 3;

←concat[cons[A;B];dotem[( );( )]]

%1Now the evaluation of %3dotem[( );( )]%1 returns our needed %3( )%*, giving

←%3concat[cons[A;B];( )] = concat[(A . B);( )] = ((A . B))
.END
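
A slightly longer computation shows the final list growing from the %3(#)%*
supplied by the termination case:

.BEGIN TABIT2(10,40);SELECT 3;GROUP;
\dotem[(A B);(C D)] \= concat[cons[A;C];dotem[(B);(D)]]
\\= concat[(A . C);concat[cons[B;D];dotem[( );( )]]]
\\= concat[(A . C);concat[(B . D);( )]]
\\= concat[(A . C);((B . D))]
\\= ((A . C) (B . D))
.END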

If the termination condition of %3dotem%* returned anything other than
%3(#)%* then the list-construction would "get off on the wrong foot"
and would not generate a true list.

As promised on {yon(P16)}, here is a "direct" definition of %3reverse%*.

.BEGIN SELECT 3;TABIT1(13);GROUP;

.P23:
reverse[x] <=\[null[x] → ( );
\ %et%* → append[reverse[rest[x]];concat[first[x];( )]]] 
.END

This reversing function is not as efficient as the previous one.
Within the construction of the reversed list the %3append%* function
is called repeatedly. You should evaluate something like %3reverse[(A B C D)]%*
to see the gross inefficiency.
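
A partial expansion shows the difficulty:

.BEGIN TABIT2(10,40);SELECT 3;GROUP;
\reverse[(A B C)] \= append[reverse[(B C)];(A)]
\\= append[append[reverse[(C)];(B)];(A)]
\\= append[append[append[reverse[( )];(C)];(B)];(A)]
\\= append[append[append[( );(C)];(B)];(A)]
\\= append[append[(C);(B)];(A)]
\\= append[(C B);(A)]
\\= (C B A)
.END

Each %3append%* must re-traverse the list built so far, so the number of
%3concat%* operations grows as the square of the length of the argument.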
.END

It %2is%* possible to write a directly recursive reversing function
with no auxiliary functions, no functions other than the primitives, and
no efficiency. We shall do so simply because it is a good example of
the process of discovering the general case of the recursion by careful
consideration of examples. Let us call the function %3rev%*.

Let's worry about the termination conditions later. Consider, for example,
%3rev[(A#B#C#D)]%*. This should evaluate to %3(D#C#B#A)%*. How can we
construct this list by recursive calls on %3rev%*? 
In the following, assume %3x%* has value %3(A#B#C#D)%*.
Now note that %3(D#C#B#A)%* is the
value of %3concat[D;(C#B#A)]%*. Then %3D%* is %3first[rev[rest[x]]]%* (it is also
%3first[rev[x]]%* but that would not help us).
How can we get %3(C#B#A)%*?## Well:
.BEGIN TABIT2(21,30);GROUP;SELECT 3;

\(C B A)\= rev[(A B C)]
\\= rev[concat[A;(B C)]]   %1(we are going after %3rest[x]%* again)%3
\\                        %1but first  we can get %3A%* from %3x.
\\= rev[concat[first[x];(B C)]]
\\= rev[concat[first[x];rev[(C B)]]]
\\= rev[concat[first[x];rev[rest[(D C B)]]]]
\\= rev[concat[first[x];rev[rest[rev[rest[x]]]]]]
.END

.BEGIN SELECT 3;CENTERIT;

%1Finally%*
←rev[x] = concat[first[rev[rest[x]]];rev[concat[first[x];rev[rest[rev[rest[x]]]]]]]
.END

%1
The termination conditions are simple. First %3rev[(#)]%* gives %3(#)%*.
Then notice that the general case which we just constructed has %2two%*
%3concat%*s.  That means the shortest list which it can make is of length two.
So lists of length one are handled separately: the reverse of such a list
is itself.
Thus the complete definition should be:

.BEGIN SELECT 3;GROUP;TABIT1(10);

rev[x] <= [\null[x] → ( );
\null[rest[x]] → x;
\%et%* → concat[first[rev[rest[x]]];rev[concat[first[x];rev[rest[rev[rest[x]]]]]]] ]
.END
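
Checking the definition on a short list shows the singleton case at work:

.BEGIN TABIT2(10,25);SELECT 3;GROUP;
\rev[(A B)] \= concat[first[rev[(B)]];rev[concat[A;rev[rest[rev[(B)]]]]]]
\\= concat[first[(B)];rev[concat[A;rev[( )]]]]
\\= concat[B;rev[(A)]]
\\= concat[B;(A)]
\\= (B A)
.END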

.REQUIRE "PROB5.PUB" SOURCE_FILE;

.SEC(Applications of LISP,Applications)
.BEGIN INDENT 20,20;
"...All the time I design programs for nonexisting machines and add:
`if we now had a machine comprising the primitives here assumed, then the job is
done.'

 ... In actual practice, of course, this ideal machine will turn out not to exist, so
our next task --structurally similar to the original one-- is to program the
simulation of the "upper" machine.... But this bunch of programs is written
for a machine that in all probability will not exist, so our next job will be
to simulate it in terms of programs for a next lower level machine, etc.,
until finally we have a program that can be executed by our hardware...."
.END
.BEGIN TURN ON "→";

→E. W. Dijkstra, %3Notes on Structured Programming%1
.END

.SS(Introduction)
There are at least two ways of interpreting this remark of Dijkstra.
At the immediate level we note that anyone who has programmed at a level
higher than machine language has experienced the phenomenon. For the
natural interpretation of
programming in a high-level language is that of %2writing%* algorithms for
the non-existing high-level machine. Typically, however, the changes of 
representation from machine to machine are all done automatically: from
high-level, to assembly language, and finally to hardware instructions.

The more fruitful view of Dijkstra's remark is related to our
discussions of abstract data structures and algorithms. In this view
we express our algorithms and data structures in terms of abstractions
independent of how they may be represented in a machine; indeed we can
use the ideas of abstraction %2regardless%* of whether the formalism
will find a representation on a machine. This use of abstraction is the
true sense of the programming style called "structured programming".
We will see in this chapter how this programming style is a natural 
result of writing representation-independent LISP programs.

As we have previously remarked, we will see a close relationship
between the structure of an algorithm and the structure of the data.
We have seen this already on a small scale: list-algorithms recur "linearly"
on %3rest%* to %3(#)%*; S-expr algorithms recur "left-and-right" on
%3car%* and %3cdr%* to an atom.
Indeed, the instances of control structures appearing in an algorithm 
typically parallel
the style of inductive definition of the data structure which the
algorithm is examining. If a structure is defined as:

.BEGIN CENTERIT;
←%eD%1 ::= %eD%41%1 | %eD%42%1 | %eD%43%1  
←e.g.  <seq elem> ::= <indiv> | <seq>
.END
then we can expect to find a conditional expression whose
predicate positions are filled by the recognizers for the
%eD%4i%1's. If the structure is defined as:
.BEGIN CENTERIT;
←%eD%1 ::= %eD%41%1 ...  %eD%41%1  
←e.g.   <seq> ::= %2(%*<seq elem>%2,%* ..., <seq elem>%2)%*
.END
that is, a homogeneous sequence of elements, then we will have a 
"linear" recursion like that experienced in list-algorithms⊗↓ Indeed
there are other forms of control like iteration or 
%3lit%* ({yon(P133)})
which are more natural for such data structures.←.
Finally if the structure is defined with a fixed number of components as:
.BEGIN CENTERIT;
←%eD%1 ::= %eD%41%1 %eD%42%1 %eD%43%1... %eD%4n%1
←e.g.     <sexpr> ::= %3(%*<sexpr> . <sexpr>%3)%*
.END
then we can expect a full recursion like that  of S-exprs⊗↓You should have
noticed that we are therefore dealing with essentially "context-free" abstract
data structures; i.e., those generated by context-free grammars.←.

Thus a data-structure algorithm tends to "pass off" its
work to subfunctions which will operate on the components of the data
structure. If a structure of type %eD%1 is made up of components 
of types 
%eD%41%1,
%eD%42%1,
%eD%43%1, and 
%eD%44%1, then the structure of an algorithm %3f%* operating on %eD%*
typically involves calls on subfunctions %3f%41%1 through %3f%44%1 to
handle the subcomputations. Each %3f%4i%1 will in turn break up its %eD%4i%1.

Thus  the type-structure of the call on %3f%* would be: 

.BEGIN CENTERIT;

←%3f[%eD%3] = g[%3f%41%3[%eD%41%3];%3f%42%3[%eD%42%3];%3f%43%3[%eD%43%3];%3f%44%3[%eD%44%3]]
.END

This is the essence of level-wise programming: we write %3f, f%41%*,#...#,f%44%1
independently of the representation of their data structures.
%3f%* will run provided that the %3f%4i%1's are available. 
As we write the %3f%4i%1's we will probably invoke computations on
components of the corresponding %eD%4i%1. Those computations are in turn
executed by subfunctions which we have to write. This process of
elaboration terminates when all subfunctions are written and all
data structures have received concrete representations. In LISP this means
the lowest level functions are expressed in terms of LISP primitives
and the data structures are represented in terms of S-exprs. Thus at the highest
level we tend to think of a data structure as a class of behaviors; 
we don't care about the internal mechanisms which implement that behavior.
At the lowest
level, machine-language routines  simulate %2one%* of many possible representations.

This process of elaboration of abstract algorithm and abstract data structure
does not invalidate the top-level definition of %3f%*; it remains
intact.  We should note, however, that this style of programming is not a
panacea; it is no substitute for clear thinking.
It only helps control the complexity of the programming process. With this
in mind, here are some programming examples.
.SS(Examples of LISP applications,,P255:)
.BEGIN TURN ON "←";

The next few sections will examine some non-trivial problems
involving  computations on data structures.  We will describe
the problem intuitively, pick an initial representation for the
problem, write the LISP algorithm, and in some cases "tune" the
algorithm by picking "more efficient" data representations.

The examples share other important characteristics: 

%21.%*  We examine the problem domain and attempt to represent its elements 
as data structures.

%22.%*  We reflect on our (intuitive) algorithm and try to express it 
as a LISP-like data-structure manipulating function.


%23.%*  While performing %21%* and %22%*, we might have to modify some of
our decisions. Something  assumed to be structure might better be represented
as algorithm, or the converse might be the case.

%24.%*  When the  decisions are made, we evaluate the LISP function on a 
representation of a problem.

%25.%*  We reinterpret the data-structure output as an answer to our problem.

Pictorially in terms of LISP:

.BEGIN GROUP;TABIT1(35);
.P46:

intuitive algorithm => LISP function\|
\|  evaluation
\|==============> interpret S-expr output as answer
\|
problem domain      => S-expressions\|

.END

Whenever we write computer programs,  whatever language we use,
we always go through a similar representation problem. 
The process is more apparent in a higher-level language like 
FORTRAN or ALGOL, and is most noticeable in a language like LISP which
primarily deals with data structures.

When we deal with numerical algorithms, the representation problem
has usually been settled in the transformation from real-world situation
to a numerical problem. We have to think more explicitly about
representation when we deal with structures like arrays or matrices. 
We are encoding
our information in the array. But the preceding diagram occurs 
within the machine, even for strictly non-structured
numerical calculation.

.BEGIN GROUP;TABIT1(40);

numerical algorithm => machine instructions\|
\|  execution
\|==========> interpret binary number as answer
\|
numbers => binary representation\|

.END

The encodings are done by the input routines. The result of the execution is
presented to the external world by the output routines.

However, it is not until we come to data-structure computations, or
non-numerical computations, that the representation problem really
becomes undeniable.  We have to think more about what 
we are doing since we lack certain preconceptions or intuitions about
such computations. More importantly, however, we are trying to represent
actual problems %2directly%* as machine problems. We do not attempt to
first analyze them into a complex mathematical theory, but should
try to express our intuitive theory directly as data-structure computation.
This is a different kind of thinking, due wholly to the advent of 
computers.  Indeed the field of computation has expanded so much
as to make the term  "computer" obsolete. "Structure processor" is more
indicative of the proper level at which we should view "computers".

We have already seen a simple example of the representation  problem
in the discussion of list-notation beginning in {yonss(P100)}.


.BEGIN GROUP;TABIT1(35);

list algorithm => LISP function\|
\|  evaluation
\|============> re-interpret S-expr result as list-output.
\|
list expression => S-expression\|

.END

The following sections deal with representation of complex data structure problems
in LISP. 
.SS(Differentiation,differentiation,P41:)

This example will describe a rudimentary differentiation
routine for polynomials in several variables.  
We will develop this algorithm through several stages. We will begin
by doing a very direct, but representation-dependent, implementation.
We will encode polynomials as special LISP lists and will express the 
differentiation
algorithm as  a LISP program operating on that representation.
When this program is completely specified we will then scrutinize it,
attempting to see just how much of the program and data structure
is representation and how much is essential to the expression of the
algorithm.

You should recognize two facts about the differentiation algorithm for polynomials:
First, the algorithm operates on forms (or expressions) 
as arguments and returns forms
as values. Typically we think of the arguments and values to functions
as being simple values, or the %2results%1 of evaluating complex forms,
 not forms themselves.
Second, and more important, 
you should realize that the  algorithm for differentiation is
recursive!  The question of differentiation of a sum is thrown
back on the differentiation of each summand.  Similar relationships 
hold for products, differences, and powers.  As with all
good recursive definitions, there must be some termination
conditions.  What are the termination conditions here?  
Differentiation of a variable, say %3x%*, with respect to %3x%* is defined to be 1;
differentiating a constant, or a variable not equal to %3x%* with
respect to %3x%* gives a result of zero.  
This problem should begin to sound like that of the %aIND%*-definitions of sets
(in this case the set of polynomials) and the associated %aREC%*-definitions
of algorithms (in this case differentiation of polynomials).
If this %2is%* the mold into which our current problem fits, then our first order
of business is to give an inductive definition of  our set of polynomials.
Though polynomials can be arbitrarily complex, involving the operations of addition,
multiplication, negation, and exponentiation,  their general format is very simple
if they are described in our LISP-like notation where the operation 
precedes its
operands. If we assume that binary plus, times, and exponentiation are 
symbolized by +, *, and ↑, we will
write %3+[x;2]%* instead of the usual infix notation  %3x+2%*.
The general term for this LISP-like notation is %2⊗→prefix notation↔←%*.

Here are some examples of infix and prefix representations:
.GROUP SKIP 1;
.BEGIN TABIT2(30,45);
.GROUP
\%2infix\prefix
%3

\x*z+2y\+[*[x;z]; *[2;y]]
\x*y*z\*[x;*[y;z]]
%1
.APART
.END

We will now give an inductive definition of the set of polynomials
we wish to consider. The definition will involve an inductive definition
of terms.

.BEGIN INDENT 0,5;GROUP;
%21.%*  terms are polynomials.

%22.%*  If p%41%* and p%42%* are polynomials then the "sum"
of p%41%* and p%42%* is a polynomial.
.END

.GROUP
where:
.BEGIN INDENT 0,5;
%21.%*  constants and variables are terms.

%22.%*  If t%41%* and t%42%* are  terms then the "product" 
of t%41%* and t%42%* is a term.

%23.%*  If t%41%* is a variable and t%42%* is a constant then 
"t%41%* raised to the t%42%*%8th%1 power" is a term.
.END

%24.%*  If t%41%* is a term then the "minus" of
t%41%*  is a term.
.END
.APART

Armed with prefix notation we can now give a BNF description
of the above set:


.BEGIN TABIT1(13);GROUP;
.P149:

<poly>\::= <term> | <plus>[<poly>;<poly>]

<term>\::= <constant> | <variable> | <times>[<term>;<term>] | <expt>[<variable>;<constant>]
\::= <minus>[<term>]

<constant>\::= <numeral>  

<plus>\::= + 

<times>\::= * 

<expt>\::= ↑

<minus>\::= -

<variable>\::= <identifier>

.END

It is easy to write recursive
algorithms in LISP; the only problem is that the domain (and
range) of LISP functions is S-exprs, not the polynomials which
we need.  So we need a way to represent arbitrary polynomials as S-exprs.
We will do the representation
in lists rather than  S-exprs.
Let %eR%* be a function mapping polynomials to their representation such
that 
a variable is mapped to its uppercase counterpart in the 
vocabulary of LISP atoms. Thus:
.BEGIN CENTER;
%eR%f(%1<variable>%f)%1#=#<literal atom>.
.END
Let constants (numerals) be just the LISP numerals;
these are also respectable LISP atoms.  Thus:
.BEGIN CENTER;
%eR%*%f(%1<numeral>%f)%1#=#<numeral>.
.END

Since we have now specified a representation for the base domains of the
inductive definition of our polynomials, it is reasonable to think about the
termination cases for the recursive definition of differentiation.

.GROUP
We know from differential calculus that
if %3u%* is a constant or a variable then:

.BEGIN TABIT2(10,20);
\d%3u%*/d%3x%* =\1 if %3x = u
\\%10 otherwise
.END
.APART

.GROUP
We
will represent the d-operator as a binary  LISP function named %3diff%*.
The application, d%3u%*/d%3x%* will be represented as %3diff[u;x]%*.
Now since constants and variables are both represented as atoms,
we can check for both of these cases by using the predicate %3isindiv%*.
Thus a representation of the termination cases might be:

.BEGIN TABIT2(10,20);
.P182:
\%3diff[u;x] <= [isindiv[u] → [eq[u;x] → 1; %et%* → 0] ....%*
.END
.APART
Notice that we write the abbreviation %3isindiv%* instead of %3isindiv%4r%1.
You should be a bit wary of our definition already: %3diff[1;1]%* will
evaluate  to %31%*.


Now that we have covered the termination case, what can be done for the
representation of the remaining class of terms and polynomials? 
That is, how should we represent sums and products?

First, we will represent the operations *, +, -, and ↑ as atoms:

.BEGIN CENTERIT;GROUP;

←%eR%*%f(%1 + %f)%1 = %3PLUS%1
←%eR%*%f(%1 * %f)%1 = %3TIMES%1
←%eR%*%f(%1 - %f)%1 = %3MINUS%1
←%eR%*%f(%1 ↑ %f)%1 = %3EXPT%1
%1
.END

We will now extend the mapping %eR%* to occurrences of binary operators by mapping
to three-element lists:

.BEGIN CENTERIT;
←%eR%*%f(%3 %9α%3[%9β%41%3;%9β%42%3] %f)%1 = %3(%eR%f(%9α%f)%3, %eR%f(%9β%41%f)%3, %eR%f(%9β%42%f)%3).
.END

.BEGIN CENTERIT;GROUP
Unary applications will result in two-element lists:

←%eR%*%f(%3 %9α%3[%9β%3] %f)%1 = %3(%eR%f(%9α%f)%3, %eR%f(%9β%f)%3).
.END

.BEGIN CENTERIT; GROUP;
%1For example:←%eR%f(%3 +[x; 2] %f)%1 =  %3(PLUS X 2)%*
.GROUP
A more complicated example: the polynomial,
%3

←x%82%* + 2yz + u
%1
.APART
will be translated to the following prefix notation:

%3
←+[↑[x;2]; +[*[2;*[y;z]]; u]]     ⊗↓%1This is messier than it really needs to 
be because we assume that + and * are binary. You should also notice
that our %eR%*-mapping is applicable to a larger class of expressions
than just <poly>. Look at %3(x#+#y)*(z#+#2).←  
.FILL
%1

From this it's easy to get the list form:
.NOFILL
%3
←(PLUS (EXPT X 2)  (PLUS (TIMES 2 (TIMES Y Z)) U))
%1
.FILL
%1
.BEGIN TABIT3(10,28,39);CENTERIT;
.GROUP
Now we can complete the differentiation algorithm for + and *.  We know:

\d[%3f + g%*]/d%3x%* = d%3f%*/d%3x%* + d%3g%*/d%3x%*.
%1
.APART

.GROUP
We would see:←%3u = %eR%f(%3 f + g %f)%1  = %3(PLUS, %eR%f(%3 f %f)%3, %eR%f( %3g %f)%3)%1
Where:\%3second[u] = %eR%f( %3f %f)%1 and %3
\third[u] = %eR%f(%3 g %f)%1.        ⊗↓As we intimated earlier, we have done an evil thing here.
We have tied the algorithm for symbolic differentiation to a specific 
representation for polynomials. It is useful to use a specific representation
for the moment and repent later. In particular, see {yon(P74)}, {yon(P60)}
and {yon(P73)}.←
.APART


.GROUP;
.FILL
The result of differentiating %3u%* is to be a new list of three
elements:
.NOFILL

\1.  the symbol %3PLUS%*.
\2.  the effect of %3diff%* operating on %eR%f( %3f %f)%1
\3.  the effect of %3diff%* operating on %eR%f( %3g %f)%1
.APART

.GROUP
Thus another part of the algorithm:

%3
\eq [first[u];PLUS] →\list [PLUS; diff[second[u];x];diff[third[u];x]].
%1


d[%3f*g%*]/d%3x%* is defined to be %3f%**d%3g%*/d%3x%* + %3g%**d%3f%*/d%3x%*.
.APART


.GROUP
So here's another part of %3diff%*:
%3
\eq[first[u];TIMES] →\list[PLUS; 
\\     list[TIMES; second[u];diff [third[u];x]];
\\     list[TIMES;third[u];diff [second[u];x]]]
%1
.APART

.GROUP;
Finally, here's an example. We know:

←d[%3x*y + x]%*/d%3x = y + 1%*

.END
.GROUP
Try:
%3
.BEGIN TABIT3(10,15,19);
diff [(PLUS (TIMES X Y) X); X]
\=list [PLUS; diff[(TIMES X Y); X]; diff [X;X]]
\=list [PLUS; 
\\list [PLUS; 
\\\list [TIMES; X; diff [Y;X]];
\\\list [TIMES; Y; diff [X;X]]];
\\diff [X;X]]

\=list [PLUS;
\\list [PLUS;
\\\list [TIMES; X ;0];
\\\list [TIMES; Y;1]];
\\1 ]

.APART
\=(PLUS  (PLUS (TIMES X 0) (TIMES Y 1)) 1)

.END
%1
which can be interpreted as:

%3
←x*0 + y*1 + 1 .
%1

Now it is clear that we have the right answer; it is equally clear that
the representation leaves much to be desired. There are obvious
simplifications which we would expect to have done before we would
consider this output acceptable. This example is a
particularly simple case for algebraic simplification. We can easily
write a LISP program to perform the simplifications expected here:
replacing %30*x%1 by %30%*, and %3x*1%* by %3x%*. But the general
problem of writing simplifiers, or indeed of recognizing 
what is a simplification, is quite difficult.
A whole branch of computer science has grown up around 
symbolic and algebraic manipulation of expressions. One of the
crucial parts of such an endeavor is a sophisticated simplifier.

.END
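
To suggest the flavor of such a program, here is a minimal sketch of a
simplifier, handling only the simplifications just mentioned. The names
%3simplify%*, %3simpsum%*, and %3simpprod%* are ours, and a serious
simplifier would require far more.

.BEGIN TABIT1(18);SELECT 3;GROUP;

simplify[u] <=\[isindiv[u] → u;
\ eq[first[u];PLUS] → simpsum[simplify[second[u]];simplify[third[u]]];
\ eq[first[u];TIMES] → simpprod[simplify[second[u]];simplify[third[u]]];
\ %et%* → u]

simpsum[x;y] <= [eq[x;0] → y; eq[y;0] → x; %et%* → list[PLUS;x;y]]

simpprod[x;y] <=\[eq[x;0] → 0;
\ eq[y;0] → 0;
\ eq[x;1] → y;
\ eq[y;1] → x;
\ %et%* → list[TIMES;x;y]]
.END

Applied to the result above, %3simplify%* would reduce
%3(PLUS (PLUS (TIMES X 0) (TIMES Y 1)) 1)%* to %3(PLUS Y 1)%*.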
.<<abstracting from rep>>
.CENT(Points to note)

.SELECT 1;
This problem of representation is typical of data structure
algorithms (regardless of what language you use).  That is,
once you have decided what the intuitive problem is, pick a
representation which makes your algorithms clean (only later should you worry
about efficiency).  In the next set of examples we will see a
 series of representations, each becoming more and more "efficient"
and each requiring more "knowledge" to be built into the algorithm.

.GROUP
Here's the whole algorithm for differentiation using + and *.
%3
.BEGIN TABIT3(10,33,37);

diff[u;x] <=
\[isindiv[u] → [eq [x;u] → 1; %et%* → 0];
\ eq [first [u]; PLUS] → list\[PLUS; 
\\ diff [second [u]; x];
\\ diff [third [u]; x]];
\ eq [first[u]; TIMES] → list\[PLUS; 
\\ list\[TIMES; 
\\\ second[u];
\\\ diff [third [u]; x]];
\\ list\[TIMES; 
\\\ third [u];
\\\ diff [second[u]; x]]];
\ %et%* → %9B%*]
.END
.APART
.select 1;
.P74:

As we mentioned earlier, the current manifestation of %3diff%* encodes too
much of our particular representation for polynomials. The separation
of algorithm from representation is beneficial from at least two standpoints.
First, changing representation should have a minimal effect on the structure
of the algorithm, but %3diff%* %2knows%* that variables are represented as atoms
and a sum is represented as a list whose %3first%*-part is %3PLUS%*.
Second, readability of the algorithm suffers greatly.

How much of %3diff%* really needs to know about the representation and
how can we improve the readability of %3diff%*?

To begin with, the uses of %3first%*, %3second%*, and %3third%* are not 
particularly mnemonic⊗↓It must be admitted, however, that they are
more readable than %3car-cdr%*-chains.←. We used
%3second%* to get the first argument to a sum or product and used %3third%*
to get the second.
We used %3first%* to extract the operator.

.GROUP;
Let's define the selectors:

.BEGIN CENTERIT;SELECT 3;group;
←op[x] <= first[x]
←arg%41%*[x] <= second[x]
←arg%42%*[x] <= third[x]
.END
.APART
.GROUP

Then %3diff%* becomes:

.BEGIN TABIT3(10,31,35);select 3;
.GROUP

diff[u;x] <=
\[isindiv[u] → [eq [x;u] → 1; %et%* → 0];
\ eq [op[u]; PLUS] → list\[PLUS; 
\\ diff [arg%41%* [u]; x];
\\ diff [arg%42%* [u]; x]];
\ eq [op[u]; TIMES] → list\[PLUS; 
\\ list\[TIMES; 
\\\ arg%41%* [u];
\\\ diff [arg%42%* [u]; x]];
\\ list\[TIMES; 
\\\ arg%42%* [u];
\\\ diff [arg%41%* [u]; x]]];
\ %Et%* → %9B%*]
.APART
.END
.APART

Still, there is much of the representation present. Recognition of variables
and other terms can be abstracted from the representation. We need only
recognize when a term is a sum, a product, a variable or a constant.
  To test for the occurrence of a numeral we shall assume
a unary LISP predicate called %3numberp%* which returns %et%* just in the case
that its argument is a numeral.
Then,
in terms of the current  representation, we could define such recognizers or
predicates as:

.BEGIN CENTERIT; SELECT 3;group;

←issum[x] <= eq[op[x];PLUS]
←isprod[x] <= eq[op[x];TIMES]
←isconst[x] <= numberp[x]
←isvar[x] <= [isindiv[x] → not[isconst[x]]; %et%* → %ef%*]
←samevar[x;y] <= eq[x;y]

.END
Using these predicates we can rewrite %3diff%* as:

.BEGIN TABIT3(10,23,27);select 3;
.GROUP

diff[u;x] <=
\[isvar[u] → [samevar[x;u] → 1; %et%* → 0];
\ isconst[u] → 0;
\ issum[u] → list\[PLUS; 
\\ diff [arg%41%* [u]; x];
\\ diff [arg%42%* [u]; x]];
\ isprod[u] → list\[PLUS; 
\\ list\[TIMES; 
\\\ arg%41%* [u];
\\\ diff [arg%42%* [u]; x]];
\\ list\[TIMES; 
\\\ arg%42%* [u];
\\\ diff [arg%41%* [u]; x]]];
\ %et%* → %9B%3]
.APART
.END

Readability is certainly improving, but the representation is still
known to %3diff%*.
When we build the result of the sum or product of derivatives we use
knowledge of the representation.
Instead, it would be better to define:
.BEGIN CENTERIT;SELECT 3;group;

←makesum[x;y] <= list[PLUS;x;y]
←makeprod[x;y] <= list[TIMES;x;y]
.END
.GROUP;
Then the new %3diff%* is:

.BEGIN TABIT2(10,28);select 3;
.GROUP
.p101:

diff[u;x] <=
\[isvar[u] → [samevar[x;u] → 1; %et%* → 0];
\ isconst[u] → 0;
\ issum[u] → makesum[diff [arg%41%* [u]; x];diff[arg%42%* [u]; x]];
\ isprod[u] → makesum[\makeprod[arg%41%* [u];diff[arg%42%* [u]; x]];
\\ makeprod[arg%42%* [u];diff [arg%41%* [u]; x]]];
\ %et%* → %9B%3]
.APART
.END
.APART

Notice that %3diff%* is much more readable now and, more importantly, the details
of the representation have been relegated to subfunctions. Changing
representation simply requires supplying different subfunctions. No changes
need be made to %3diff%*. There has only been a slight decrease in efficiency.
The termination condition in the original %3diff%* is a bit more succinct.
Looking back, we first abstracted the selector 
functions: those which selected components; then we abstracted the recognizers:
the predicates telling which term was present; then we modified
the constructors: the functions which make new terms.
These three components of programming: selectors, recognizers, and constructors,
will appear again on {yon( P75)} in discussing McCarthy's abstract syntax.
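
For example, suppose we chose an infix representation, writing the sum of
%3x%* and %3y%* as the three-element list %3(X PLUS Y)%*. The choice is
hypothetical, but it shows how little needs to change: we supply new
selectors and constructors, the recognizers are unchanged, and %3diff%*
itself is untouched.

.BEGIN CENTERIT;SELECT 3;GROUP;

←op[x] <= second[x]
←arg%41%*[x] <= first[x]
←arg%42%*[x] <= third[x]
←makesum[x;y] <= list[x;PLUS;y]
←makeprod[x;y] <= list[x;TIMES;y]
.END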

The %3diff%* algorithm is abstract now, in the sense that the representation
of the domain and the representation of the functions and predicates which
manipulate that domain have been extracted out. This is our %9r%*-mapping
again; we mapped the domain of <poly>'s to lists and mapped the
constructors, selectors, and recognizers to list-manipulating functions.
Thus the data types of the arguments %3u%* and %3x%* are <poly> and <var>
respectively, %2not%* list and atom. To  stress this point we should make one more
transformation on  %3diff%*. We have frequently said that there
is a substantial parallel between a data structure and the algorithms which
manipulate it. Paralleling the BNF definition of <poly> on {yon(P149)}, we 
write:

.BEGIN TABIT2(10,28);select 3;
.GROUP

diff[u;x] <=
\[isterm[u] → diffterm[u;x];
\ issum[u] → makesum[diff [arg%41%* [u]; x];diff[arg%42%* [u]; x]];
\ %et%* → %9B%3]
.APART

.GROUP
diffterm[u;x] <=
\[isconst[u] → 0;
\ isvar[u] → [samevar[x;u] → 1; %et%* → 0];
\ isprod[u] → makesum[\makeprod[arg%41%* [u];diff[arg%42%* [u]; x]];
\\ makeprod[arg%42%* [u];diff [arg%41%* [u]; x]]];
\ %et%* → %9B%3]
.APART
.END

.GROUP;
To satisfy our complaint of {yon(P182)} that %3diff[1; 1]%1 gives a defined
result, we should also add:

.BEGIN TABIT2(10,22);SELECT 3;GROUP;

\diff%9'%*[u; x] <= [\isvar[x] → [ispoly[u] → diff[u; x]]; %et%3 → %9B%3]
.END
.APART

Finally, notice that our abstraction process has masked the order-dependence
of conditional expressions. Exactly one of the recognizers will be satisfied
by the form  %3u%*.



.CENT(Problems)
%21.%* Extend the version of %3diff%* of your choice to handle differentiation
of powers such as  %3↑[x; 3]%1.

%22.%1 Extend %3diff%* to handle unary  minus.

%23.%1 Extend %3diff%1 to handle differentiation of the trigonometric functions
%3sin%1 and %3cos%1 and their composition with polynomials.
For example it should handle %3sin%82%3x#+#cos(x%83%3#+#5x#-2)%1.

%24.%1 Write an algorithm to handle integration of polynomials.
.SS(Data Bases,,P220:)

One of the more intriguing applications of LISP is in the area of
data base management. 
In {yonss(P222)} we will discuss data bases in a more general
setting.  In this section we introduce the ideas and suggest
how LISP can be applied to the problems.

A data base is a collection of objects
together with a set of functions to pose questions about the
objects in the base, to select objects from the base, and to construct new
entries in the base. That is, a data base is an abstract data structure.
We need to locate information in the base.
We should be able to ask the system for  a  specific object or
we should be able to partially specify our request ("find all
books about LISP" or "find all books about LISP published before 1975").
We should be able to  add entries and delete entries, but we will postpone
these kinds of requests until  later.

The representational details of objects will be suppressed as usual, and
we will concentrate on the abstract properties.
In our first example, the objects in the data base will represent
constants: an object will have a name and a collection of properties and
values. 

.GROUP
.BEGIN SELECT b;NOFILL;PREFACE 0 MILLS;GROUP;
.TURN OFF "%","#","α","←";
.TURN ON "∞" FOR "%";

     	⊂αααααααπααααα⊃
        ~ prop1 ~ val1~
	εαααααααβαααααλ
        ~ prop2 ~ val2~
             # # #
	εαααααααβαααααλ
        ~ propn ~ valn~
       	%ααααααα∀ααααα$  		  
.END
.CENT(An object representation)
.APART
For example, a  data base dealing with business supplies
might have objects named boxes. Each box has properties like 
size and contents.

Not all objects need to have the same number of properties. For
example in a data base  whose objects are bibliographic references,
books need not have page references, whereas journal articles do;
and journal references don't include a publisher whereas books do.
The programs which manipulate the data base must be structured to
take changeability into account.

.GROUP;
Here are some examples: the first one was extracted from the side of a
Xerox paper box; the second might be a representation of a bibliographic entry
for this book.
.BEGIN SELECT b;NOFILL;PREFACE 0 MILLS;GROUP;
.TURN OFF "%","#","α","←";
.TURN ON "∞" FOR "%";

     	⊂αααααααπαααααααααααα⊃
        ~ NAME  ~  4029258   ~
	εαααααααβααααααααααααλ
        ~ SIZE  ~ 8-1/2 x 11 ~
	εαααααααβααααααααααααλ
        ~ COLOR ~   WHITE    ~
	εαααααααβααααααααααααλ
        ~ AMNT  ~  10 REAMS  ~
       	%ααααααα∀αααααααααααα$  		  
.END
.APART
.BEGIN SELECT b;NOFILL;PREFACE 0 MILLS;GROUP;
.TURN OFF "%","#","α","←";
.TURN ON "∞" FOR "%";

     	⊂ααααααααπααααααααααααααααααααα⊃
        ~ AUTHOR ~     ALLEN, JOHN, R. ~
	εααααααααβαααααααααααααααααααααλ
        ~ TITLE  ~ THE ANATOMY OF LISP ~
	εααααααααβαααααααααααααααααααααλ
        ~ TYPE   ~          BOOK       ~
	εααααααααβαααααααααααααααααααααλ
        ~ PUBL   ~     MCGRAW-HILL     ~
	εααααααααβαααααααααααααααααααααλ
        ~ DATE   ~        1977	       ~
       	%αααααααα∀ααααααααααααααααααααα$  		  
.END
Given a data base of objects, we need to be able to manipulate these objects in
meaningful ways. We will not address the problems of designing
input and output, but will concern ourselves solely with the problems
of the semantics of data base primitives: how can we use the information in the 
base?


In requesting information from a data base, we typically  specify
part of the request and expect the system to come up with a set of possibilities
which fit our description. For example, the request: "find all books
about LISP", specifies that we are interested only in books, not in journal
articles or course notes; the topic is specified to be LISP, but 
the system is free to select the other components: the author, the title,
the publisher and the date of publication. The objects which are specified
are called %2constants%1, the unspecified components are %2variables%1.
A request is a structure  called a %2pattern%1 and consists of
an ordered collection of constants and variables.
The elements in the data base are also patterns; for this example, they
contain only constants; such constant patterns are also called records.
The process of discovering whether or not a record in the data base
matches the request is called %2pattern matching%1.


We describe a simple pattern matcher named %3match%1. It expects
two arguments. The first argument is a constant pattern called %3pat%1.
The second argument, %3exp%1
represents a request; it may be constant, or it may contain variables.
If it does contain variables, then  the pattern matching process
must establish a match between those variables and components of our
data base object. The value returned by %3match%1 will either 
represent  the associations built up to match the constant pattern to the
expression, or the value returned will  indicate failure if
no match is possible. 

Patterns will be represented as lists where atoms represent constants,
and variables will be represented as lower-case greek letters.
We will represent failure by returning the atom %3NO%1. 
In the case that a match is possible, we will  return a list of 
pairs, where each pair is a variable and its matching constant.
.GROUP;
For example:
.BEGIN SELECT 3;GROUP;CENTER;
match[(A (B C));(A (B %7a%3))] = ((%7a%3 C))
match[(A B C);(A %7a%3 %7b%3)] = ((%7a%3 B) (%7b%3 C))
match[(A B C);(A C %7b%3)] = NO
.END
.APART

.GROUP;
Pattern matching can become quite complex. For example:
.BEGIN SELECT 3;GROUP;CENTER;
match[(A (B C) (D C));(A (B %7a%3) (%7b%3 C))] = ((%7a%3 C) (%7b%3 D))
match[(A (B C) (D C));(A (B %7a%3) (%7a%3 C))] = NO
.END

The second example fails since
once we have associated %3C%1 with %7a%1 we must use that association
throughout the rest of the pattern match; and %3(D#C)%1 does not match
%3(%7a%3#C)%1 when %7a%1 denotes %3C%1.
.APART
We will write %3match%1 in terms of a subfunction named %3match%9'%1.
This subfunction carries a third argument, %3mlist%1, which represents the
list of 
partial matches. Whenever we locate a variable in the expression, we
examine the current %3mlist%1. If the variable appears, then we must check
its entry against the corresponding part of the pattern. If the variable
does not occur in %3mlist%1, then we associate the variable with 
the appropriate part of the constant pattern.
.BEGIN SELECT 3;GROUP;TABIT2(22,32);
.P229:

match[pat;exp] <= match%9'%3[pat;exp;( )]

match%9'%3[pat;exp;mlist] <= \[equal[mlist;NO] → NO;
\ isvar[exp] → check[exp;pat;lookup[exp;mlist];mlist];
\ isconst[pat] → [sameconst[pat;exp] → mlist; %et%3 → NO];
\ %et%3 → match%9'%3[\suffix[pat];
\\suffix[exp];
\\match%9'%3[prefix[pat];prefix[exp];mlist]]
.END

.BEGIN SELECT 3;GROUP;TABIT1(24);

check[var;exp;val;mlist] <= \[not[val] → concat[mkent[var;exp];mlist];
\ sameconst[exp;val] → mlist;
\ %et%3 → NO]
.END

.BEGIN SELECT 3; GROUP;TABIT1(16);

lookup[var;l] <=\[null[l] → %ef%3;
\ samevar[var;name[first[l]]] → val[first[l]];
\ %et%3 → lookup[var;rest[l]]]
.END
To complete our description of %3match%1 we should supply the data structure
manipulating functions: %3isconst%1, %3isvar%1, %3prefix%1, %3suffix%1, 
%3samevar%1, and %3sameconst%1;
and %3mkent%1, %3name%1, and %3val%1. The first five are related, dealing with
the representation of patterns; the final three involve the representation
of the match list. Note that we %2have%1 assumed that %3mlist%1 is a list.
We leave it to the reader  to supply representations of these eight functions.
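
Assuming a representation in which %3prefix%1 and %3suffix%1 behave like
%3first%1 and %3rest%1, and in which %3( )%1 is treated as a constant
matching itself, a small request traces as follows:

.BEGIN SELECT 3;GROUP;TABIT2(10,35);

\match[(A B);(A %7a%3)] \= match%9'%3[(A B);(A %7a%3);( )]
\\= match%9'%3[(B);(%7a%3);match%9'%3[A;A;( )]]
\\= match%9'%3[(B);(%7a%3);( )]
\\= match%9'%3[( );( );match%9'%3[B;%7a%3;( )]]
\\= match%9'%3[( );( );((%7a%3 B))]
\\= ((%7a%3 B))
.END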

Given a basic pattern matcher, we can begin to elaborate on a
data base management system. We need some means of controlling the matcher.
If several entries in the system match the inquiry, then we must decide how to
manage the matches. In simple cases we could  make a list of all the possibilities.
If the number of matches is very large we might want to  return a few
at a time, remembering where we were in the search of the base.
The natural extension of this idea
is to allow a potentially infinite set of elements to be present in the data base. 
In programming languages we are able to talk about such potentialities by
using a procedure.

Instead of having objects explicitly
stored in the base, we may allow procedures to occur as data base elements.
Such a procedure would generate elements. For example, instead of storing 
the integers as explicit objects, we could store a procedure to generate
the integers. This introduces two problems: how do we store procedures as
data objects;  and, assuming that we have called such a procedure and
it has delivered an explicit object, how do we represent the
notion that the %2next%1 time we call that procedure, we want the
%2next%1 object?  That is, a procedure named %3get_next_integer%1
should return %31%1 the first time it is called, but know to return
%32%1 the next time it is called in the same context.

Other possible extensions involve the operations on the base.
Assume that
we know that "all roses are red" and we know that object O%41%1 is
a rose; if we ask the data base for all red objects, we should expect
to see O%41%1 appear as a candidate. That expectation requires
a deductive ability built into the base manipulator. That is, we need not have
explicitly stored information in the base, but we expect to be able to
deduce facts from information in the base using some relationships
and reasoning ability.

There are at least two ways the "roses are red" problem can be solved.
Notice that "all roses are red" is much like a procedure; given an
object which is a rose, it generates an object which is red. So,
on entering a rose object in the data base, the system could
also explicitly add the fact that the rose was red. This is an example
of an input demon. A demon is a procedure which is not explicitly called
but is activated by the occurrence of another event.
Whenever an object is added to the base, the collection of input demons
is checked. If an applicable demon is found, it is activated; its activation
might activate other demons. 

The activation of a demon is a different kind of procedure call than
previously seen. The activation is done on pattern matching and
thus is generally known by the term %2⊗→pattern directed invocation↔←%*.
The demon procedure is stored in the data base along with a pattern
which determines conditions for its activation. In the case of an input
demon, an input to the base initiates a match of the input demon patterns
against the input. If a match is found, the procedure body is executed
typically using some of the match information.

.GROUP;
Let's establish some notation and give an example.
To introduce records to our system we use a unary procedure named %3add_item%1.
The argument to %3add_item%1 is the record we wish to add.

%3add_item[(ROSE O1)]%1

We will use a ternary procedure named %3add_demon%1 to insert demons
in the base. The first argument is the type of demon; so far we  have
discussed demons invoked by adding elements; we will also
have demons which are applied when items are removed, or when items are
accessed. These three types will be named %3ADD%1, %3REMOVE%1, and %3FETCH%1.
The second argument is the pattern which will invoke this demon; and
the third argument is the action to be taken if the pattern matches.
For example:

%3add_demon[ADD;(ROSE %7a%3);add_item[(RED %7a%3)]]%1


.APART
Demons are also used to monitor the removal of information from the base.
For example, a %3REMOVE%1 demon can maintain the consistency of derived
facts: if a rose object is removed from the base, the derived fact that it
was red should disappear as well.
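A sketch, assuming a hypothetical removal primitive %3remove_item%1,
analogous to %3add_item%1:

%3add_demon[REMOVE;(ROSE %7a%3);remove_item[(RED %7a%3)]]%1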

The third use of demons is involved with another possible solution to the
"all roses are red" problem.
Instead of explicitly adding the fact that %3O1%1 is a red object
we might wait until a request for red objects occurs. At that time
we could use the "all roses are red" demon %2backwards%1.
That is, we could look for any roses in the  data base; the assertion
that any rose object is also a red object allows us to accept
rose objects as solutions to our inquiry.
This feature introduces a certain deductive capability to our system.
It also introduces some organizational problems.

We have to be able to  recognize when a procedure is capable
of producing objects of the desired type. We therefore
index these data base procedures by a pattern which tells what
the procedure accomplishes. That pattern is called the procedure's goal
and the invocation of such a procedure is again pattern-directed, but
has an added connotation of being %2goal-oriented%1.

.GROUP;
Again, we introduce some notation and an example. Let the request
for a data base item be given by:
.BEGIN CENTER;SELECT 3;
%3fetch[%7a%3]%1, where %7a%1 is a pattern.
.END
Since a  %3fetch%1 request might discover several possibilities,
some being items  and some being goal-directed procedures, we 
need a way of examining the selected information.

We introduce a  function named %3try_next%1, whose single
argument is the result of a %3fetch%1. Each call on %3try_next%1
either produces a new item or signals that no more items exist
on the fetch list. 

.APART
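As an illustration, the backwards use of "all roses are red" might be
installed as a %3FETCH%1 demon, so that a request for red objects also
proposes roses. This is only a sketch of one plausible convention:

%3add_demon[FETCH;(RED %7a%3);fetch[(ROSE %7a%3)]]%1

A subsequent %3fetch[(RED %7a%3)]%1 then collects both the explicit red
items and the candidates produced through the rose rule, and %3try_next%1
examines them one at a time.
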
An extension to this basic data base manipulating system has proved
convenient in artificial intelligence research. Let us assume we wish to
derive a plan or scheme for achieving a desired goal. In the 
derivation process we will make hypotheses and then pursue
their implications. A similar  behavior can be simulated if we allow
the creation of multiple data bases. Each base corresponds to a
hypothetical situation or world, and the  %3fetch%1-ing of
an object in a world corresponds to asking whether or not a 
desired state is attainable in that world. 

Instead of requiring that
all transformations occur in one data base, several systems 
(⊗↑[Con#73]↑, ⊗↑[QA4#73]↑) have implemented a layered data base.
In this situation we are able to add, delete and fetch from specified
data bases. We add two operations %3push_base%1 and %3pop_base%1
which allow us to  manipulate whole data bases as objects.
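
A plausible pattern of use, assuming for this sketch that both operations
act on an implicit current data base: push a fresh layer, assert the
hypothesis, test its consequences, and pop the layer to recover the
original world.

.BEGIN CENTERIT;SELECT 3;
←push_base[]
←add_item[(ON A B)]
←try_next[fetch[(ON A %7a%3)]]
←pop_base[]
.END

Whatever is added inside the layer vanishes with the %3pop_base[]%1, while
the underlying base remains visible to %3fetch%1 throughout.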

The control structures necessary for handling such data base manipulations
are highly non-recursive and will be discussed in {yonss(P222)}.


.CENT(Problems)
I.  Recall our discussion of %3match%1 on {yon(P229)}. 
Supply a representation for match lists and supply the eight
data structure functions.

II. The %3match%1 routine we developed on {yon(P229)} required that %3pat%1
be a constant pattern. Write a more general pattern matcher named
%2unify%1 which allows either %3pat%1 or %3exp%1 to contain variables.
This more general match routine is called a unifier (⊗↑[Rob#65]↑).
.GROUP;
For example:
.BEGIN CENTER;SELECT 3;
unify[(A (B %7a%3) A); (A (%7b%3 D) %7d%3)] = ((%7a%3  D)(%7b%3 B) (%7d%3 A))

unify[(A (B %7a%3) A); (A (%7b%3 D) %7b%3)] = NO

unify[(%7a%3 A  %7a%3); (%7b%3 %7b%3 B)] = NO
.END
.APART
.SS(Algebra of polynomials)
.BEGIN TURN ON "#";

%1
Assume we want to perform addition and multiplication
of polynomials and assume that each polynomial is of the
form %3p%41%* + p%42%* + ... + p%4n%1 where each 
term, %3p%4i%1, is a product of variables
and constants.  
The two components of each term are a constant part called the coefficient,
and the variable part.
We shall assume without loss of generality that the 
set of variables which appear in the polynomials are lexicographically
ordered, e.g.#%3x#<#y#<#z,#..., %1and assume that each variable part 
obeys that ordering; thus we would insist that %3xzy%82%1 be written %3xy%82%3z%1.
We do not assume that the terms are ordered within the polynomial; thus
%3x#+#xy%1 and %3xy#+#x%1 are both acceptable.
We further assume that the variables of
each %3p%4i%1 are distinct and that no %3p%4i%1 has %30%* as coefficient.
The standard algorithm for addition of
%9S%8n%4i=1%3p%4i%1 with   %9S%8m%4j=1%3q%4j%1 says you can combine a
%3q%4j%1 with a %3p%4i%1 if the variable parts of these terms 
are identical.  In this
case the resulting term has the same variable part  but has a
coefficient equal to the sum of the coefficients of %3p%4i%1
and %3q%4j%1.
For example if %3p%4i%1 is %32x%* and %3q%4j%1 is %33x%* the sum term is %35x%*.
We will examine four representations of polynomials, before finally
writing any algorithms. To aid in the discussion we will use
the polynomial %3x%82%* - 2y - z%1 as our canonical example.

.CENT(First representation:)

We could use the representation of the differentiation
example.  
This would represent our example as:

.BEGIN CENTERIT;SELECT 3;
←(PLUS (TIMES 1 (EXPT X 2)) (PLUS (TIMES -2 Y) (TIMES -1 Z)))
.END

The above conventions specify an unambiguous representation for our class
of polynomials. Strictly speaking we did not need to impose
the ordering on the set of variables. 
However, we need to impose some additional constraints before we
have data structures which are well-suited to the class of polynomial
algorithms we wish to represent.

.CENT(Second representation:)
We are really only interested in testing the equality of the variable parts;
we will not be manipulating them in any other way.
So we might simply represent the variable part as a list of pairs;
each pair contains a variable name and the corresponding value of the exponent.
Using our knowledge of the forms of polynomials and the class
of algorithms we wish to implement, we write
%9S %3p%4i%1 as:

.BEGIN CENTERIT;
←%3( (%1rep of %3p%41%*), (%1rep of %3p%42%*), ...)%1
.END

which would make our example look like:

.BEGIN CENTERIT;SELECT 3;

←((TIMES 1 ((X . 2))), (TIMES -2 ((Y . 1))), (TIMES -1 ((Z . 1))))
.END

Is this representation sufficient?  Does it have the 
flexibility we need?  It  %2does%* suffice but it is still not terribly 
efficient. We are ignoring too much of the structure in our class of 
polynomials.

.CENT(Third representation:)

What do we know? We know that the occurrence of variables is 
ordered in each variable part; we can assume that we know the class of 
variables which
may appear in any polynomial. So instead of writing %3x%82%*y%83%*z%1
as

.BEGIN CENTERIT;
←%3((X . 2) (Y . 3) (Z . 1))%*, we could write:


←%3(2 3 1)%*   assuming %3x, y, z%* are the only variables.
.END

In a further simplification, notice that the  %3TIMES%* in the
representation is superfluous. We %2always%* multiply the coefficient by
the variable part. So we could simply %3cons%* the coefficient 
onto the front of the variable part representation.

.BEGIN NOFILL;TURN ON "\";TABS 30,45;GROUP;
Let's stop for some examples.

\%2term\representation
\%32xyz\(2 1 1 1)
\2x%82%*z\(2 2 0 1)
\4z%83%*\(4 0 0 3)
.END

Thus our canonical polynomial  would now be represented as:

.BEGIN CENTERIT;SELECT 3;
←((1 2 0 0) (-2 0 1 0) (-1 0 0 1))
.END

This representation is not too bad; the %3first%*-part of any
term is the coefficient; the %3rest%*-part is the variable part. So,
for example the test for equality of variable parts is simply a call on %3equal%*.

Now let's start thinking about the structure of the main algorithm.

.CENT(Fourth representation:)

The algorithm for the sum must compare terms; on finding like terms it will
generate an appropriate new term, and otherwise it simply copies the terms.
When we pick a %3p%4i%1 from the first polynomial we would
like to find a corresponding %3q%4j%1 with the minimum amount of
searching.  
This can be accomplished if we can order the terms
in the polynomials. 
A natural ordering can be induced on  the terms  by
ordering  the numerical representation of the exponents.
For the sake of argument, assume
that a maximum of two digits will be needed to express
the exponent of any one variable. 
Thus the exponent of %3x%82%1 will be represented as %302%*, or 
the exponent of %3z%810%1 will be represented as %310%*. Combining this with our
ordered representation of variable parts, we arrive at:
.GROUP
.BEGIN NOFILL;TABS 30,45;TURN ON "\";

\%2term\representation

.SELECT 3;
\43x%82%*y%83%*z%84\%3(43, 020304)
\2x%82%*z\%3(2, 020001)
\4z%83%*\(4, 000003)
.END
.APART
%1

Now we can order on the numeric representation of the variable
part of the term.  One more change of representation, which will
result in a simplification in storage requirements:

.BEGIN CENTERIT;
←represent %3 ax%8A%3y%8B%3z%8C%1 as %3(a . ABC).
.END

.SELECT 1;
This gives our final representation.

.BEGIN CENTERIT;SELECT 3;
←((1 . 20000) (-2 . 100) (-1 . 1))
.END

Note that %320000 > 100 > 1%1.
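
The packing arithmetic deserves a moment: the code for %3ax%8A%3y%8B%3z%8C%1
is just %3A*10000 + B*100 + C%1, so the lexicographic order on variable
parts becomes ordinary numeric order on codes. As a hypothetical helper,
not needed by the addition algorithm itself, which only compares codes:

.BEGIN CENTERIT;SELECT 3;
←pack[a;b;c] <= plus[times[a;10000];plus[times[b;100];c]]
.END

where %3plus%1 and %3times%1 are the obvious binary arithmetic functions;
for example %3pack[2;0;0] = 20000%1.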
  

Finally we will write the algorithm.  We will assume that
the polynomials are initially ordered and will write the 
algorithm so as to maintain that ordering.
Each term is a dotted pair of elements: the coefficient
and a representation of the variable part.

.P60:
As in the previous differentiation example, we should 
attempt to extract the algorithm
from the representation.
.GROUP
We shall define:

.BEGIN CENTERIT;
←%3coef[x] <= car[x]%1 and %3 expo[x] <= cdr[x].
.END
%1

To test the ordering we will use the LISP predicate:

.BEGIN CENTERIT;
←%3greaterp[x;y] = x>y.
.END
%1
.APART

In the construction of the `sum'
polynomial we will generate new terms by combining coefficients.
So a constructor named %3node%* is needed.
In terms of the latest representation %3node%* is defined as:

.BEGIN CENTERIT;
←%3node[x;y] <= cons[x;y].%1
.END

.GROUP
So here's a graphical representation of our favorite polynomial:

.BEGIN CENTERIT
←%3x%82%* - 2y - z %1
.END

%3
.BEGIN NOFILL;
.BEGIN SELECT g;NOFILL;PREFACE 0 MILLS;GROUP;
.TURN OFF "%","#";
.TURN ON "∞" FOR "%";
			    /\ 
			   /  \
			  /    \
			 /      \
			/\       \
		       /  \      /\
		      1  20000  /  \
			       /    \
			      /\     \
			     /  \     \
			   -2   100   /\
				     /  \
				    /   ∞3NIL∞*
				   /\
				  /  \
				 -1   1
.END
.APART
.GROUP
%1
Here's the algorithm:
%3
.BEGIN NOFILL;TABS 3,10,35;TURN ON "\";
polyadd[p;q] <=
\[null[p] → q; null[q] → p;
\ eq[expo[first[p]];expo[first[q]]] →\[zerop[plus[coef[first[p]];coef[first[q]]]]
\\\    → polyadd[rest[p];rest[q]];
\\\ %et%* → concat[node[plus[coef[first[p]];coef[first[q]]];
\\\                      expo[first[p]]];
\\\                 polyadd[rest[p];rest[q]]]];
\ greaterp[expo[first[p]];expo[first[q]]] → concat[first[p];polyadd[rest[p];q]];
\ %et%* → concat[first[q];polyadd[p;rest[q]]]]
.END
%1
.APART

.GROUP
Now for an explanation and example.

First we used some new LISP functions:

.BEGIN CENTERIT;
←%3plus[x;y] = x + y
←zerop[x]  <= eq[x;0]
.END
.APART
%1
.P72:

%3polyadd%* is of the form: %3[p%41%* → e%41%*; p%42%* → e%42%*; p%43%* → e%43%*; p%44%* → e%44%*; p%45%* → e%45%*]

.BEGIN SELECT 1;INDENT 0,10;FILL;
Case i: %3p%41%3 → e%41%1 and %3p%42%* → e%42%1. If either polynomial is empty return the other.

Case ii: %3p%43%* → e%43%1.  If the variable parts are equal then we can think about combining
terms.  However, we must check for cancellations and not include any 
such terms in our resultant polynomial.

Case iii: %3p%44%* → e%44%1 and %3p%45%* → e%45%1. These sections worry about the ordering of
terms so that the resultant polynomial retains the ordering.
.END

%1
Here's an informal execution of %3polyadd:
.GROUP

.BEGIN NOFILL;TABS 10;TURN ON "\";
polyadd[x+y+z;x%82%*-2y-z]
\= concat[x%82%*;polyadd[x+y+z;-2y-z]]
\= concat[x%82%*;concat[x;polyadd[y+z;-2y-z]]]
\= concat[x%82%*;concat[x;concat[node[1+-2;y];polyadd[z;-z]]]]
\= concat[x%82%*;concat[x;concat[-y;polyadd[z;-z]]]]
\= concat[x%82%*;concat[x;concat[-y;polyadd[( );( )]]]]
\= concat[x%82%*;concat[x;concat[-y;( )]]]
.BEGIN CENTERIT;
←= x%82%*+x-y
.END
.END
.APART
.END
.END
.CENT(Problems)
%21.%1 Write an algorithm, %3polymult%1, to perform the multiplication
of two polynomials.
.SS(Evaluation of polynomials,,P2:)
.BEGIN TURN ON "←";
Though you are undoubtedly quite tired of looking at polynomials, there is
at least one more operation which is usefully performed on such creatures.
That operation is evaluation.
Given an arbitrary polynomial, and values for any of the variables which it
contains, we would like to compute its value. First we will
assume that the substitution of values for variables has already been
carried out. Thus we are dealing with polynomials of the form: %9S%8n%4i=1%3p%4i%1
where %3p%4i%1 is a product of powers of constants. For example:

←%32%83%* + 3*4%82%* + 5%1. 

This could be represented as:

←%3(PLUS (EXPT 2 3) (PLUS (TIMES 3 (EXPT 4 2)) 5)).%*

We have taken this general representation because we have great expectations
of generalizing the resulting algorithm.

We will now describe a LISP function, %3value%*, which will take such an
S-expr representation and compute its value. Input to %3value%* will be
numerals or lists beginning with either %3PLUS, TIMES,%* or %3EXPT%*.
The value of a numeral is that numeral; to evaluate the other forms of input
we should perform the operation represented. We must therefore assume
that operations of addition, multiplication, and exponentiation exist.  
Assume they are
named +, *, and ↑, respectively. What then should be the value of a representation
of a sum?
It should be the result of adding the value of the representations of the
two summands or operands. That is, %3value%* is  recursive.
It should now be clear how to write %3value%*:

.BEGIN TABIT1(10);SELECT 3;

value[x] <=\[isconstant[x] → x;
\ issum[x] → +[value[arg%41%*[x]];value[arg%42%*[x]]];
\ isprod[x] → *[value[arg%41%*[x]];value[arg%42%*[x]]];
\ isexpt[x] → ↑[value[arg%41%*[x]];value[arg%42%*[x]]]]

.END
where:

.BEGIN GROUP;CENTERIT;SELECT 3;
←isconstant[x] <= numberp[x]
←issum[x] <= eq[first[x];PLUS]
←isprod[x] <= eq[first[x];TIMES]
←isexpt[x] <= eq[first[x];EXPT]
.END
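
To see the recursion in action, here is an informal unwinding in the style
of the %3polyadd%* trace:

.BEGIN NOFILL;TABS 10;TURN ON "\";SELECT 3;GROUP;
value[(PLUS (EXPT 2 3) (PLUS (TIMES 3 (EXPT 4 2)) 5))]
\= +[value[(EXPT 2 3)];value[(PLUS (TIMES 3 (EXPT 4 2)) 5)]]
\= +[↑[2;3];+[*[3;↑[4;2]];5]]
\= +[8;+[48;5]]
\= 61
.END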
.END
.<<PROBLEMS ON POLY EVAL >>
.SELECT 1;
.CENT(Problems)
1.  Show how to extend %3value%* to handle binary and unary minus.

2.  Write an algorithm  %3instantiate%*  which will take two arguments,
one representing a set of variables and values, the other representing
a polynomial. The algorithm is to return a representation of the
polynomial which would result from substituting the values for the variables.

3.  It would be nice if we could represent  expressions like 2+3+4 as
%3(PLUS#2#3#4)%* rather than %3(PLUS#(PLUS#2#3)#4)%* or %3(PLUS#2#(PLUS#3#4))%*;
or represent 2*3*4+5+6 as %3(PLUS#(TIMES#2#3#4)#5#6)%*.

Write a new version of %3value%* which can evaluate  such n-ary representations
of + and *.

.CENT(More on polynomial evaluation)
Though it should be clear that the current %3value%* function does perform
the appropriate calculation, it should be equally clear that the class of
expressions which %3value%* handles is not particularly powerful.
We might wish to evaluate  requests like:

.BEGIN tabit1(15);
.P122:

%aA%*\"What is the value of %3x*y + 2*z%* when %3x=4, y=2,%1 and %3z=1%*?"
.END

Now the function %3instantiate%*, requested in problem 2  above, offers
one solution: make a new copy of the representation of %3x*y#+#2*z%*
with the variables physically replaced by their values. This would result
in a representation of %34*2#+2*1%*, and this new expression is suitable
fare for %3value%*. Computationally, this is a terrible solution.
%3instantiate%* will go through the structure of the expression looking
for instances of variables, and when located, will replace them with
the  appropriate values. %3value%* then goes through the structure of
the resulting expression performing the evaluation. Clearly what is
desirable is a function %3value%9'%1 combining the two processes: 
the basic structure of %3value%9'%1 is that of mild-mannered %3value%*,
but when a variable, say %3x%*, is recognized inside %3value%9'%1 then we would
hope that it looks at a table like that expected by %3instantiate%*, finds %3x%*
and returns the value associated with the entry for %3x%*.

.BEGIN TURN OFF "{","}";TURN ON "#";
Let's formalize our intuitions about %3value%9'%1. 
It will be a function of two arguments. The first will be a
representation of a polynomial; the second will be a representation of
the table of variables and values. 
You may have noticed that the original version of %3value%*
in fact handles expressions which are not actually constant
polynomials; %3(2#+#3)*4%* for example. Since we will wish to
apply our evaluation functions to more general classes of expressions
we will continue, indeed encourage, this deception.
Regardless of the class of expressions we wish to examine, it is
the structure of the table which should be the first order of business.
An appropriate table, %3tbl%*, will be a set of ordered pairs  
%3<name%4i%*,#val%4i%*>%1;
thus for the above example the table %3{<x,#4>,#<y,#2>,#<z,#1>}%1 would suffice.
Following our dictum of abstraction and representation-independent programming,
we will not worry about the representational problems of such tables. We will
simply assume that "tables" are instances of an abstract data structure  
called %d<table>%*, and we will only
concern ourselves for the moment with the kinds of operations we need
to perform. We will need two selector functions: %3name%*, to select
the variable-component of a table entry; and  %3val%*, to select the
value-component. A complete discussion of such a data structure would
entail discussion of constructors and recognizers, and perhaps other
functions, but for the current %3value%9'%1 these two functions will suffice.

%3value%9'%1 will need a table-function, %3locate%*, to locate an appropriate
variable-value entry. 
%3locate%* will be a binary function, taking one argument, %3x%*, 
representing a variable;
and one argument, %3tbl%*, representing a table. 
%3locate%* will match %3x%* against the
%3name%*-part of each element in %3tbl%*; 
if a match is found then the corresponding
%3val%*-part is returned. If no match is found then %3locate%* is undefined.

So far, little structure has been imposed on elements of %d<table>%*;
tables are either empty or not; but if a table is non-empty then each
element is a pair with recognizable components of name and value.
However, the specification of  algorithms to examine
elements of %d<table>%* imposes more structure on our tables.
If we were dealing with functions then a side condition to the effect
that a table had no pairs with duplicate first elements would be sufficient.
We, however, are dealing with algorithms and therefore  must describe a method for
locating elements, and soon must describe a method for adding elements.

Recursion is the only method we have for  specifying
%3locate%*, and recursion operates by decomposing a structure. Sets are
notorious for their lack of structure; there is no order to the elements of a
set. But if we are to write a LISP algorithm for %3locate%*, that algorithm will
have to be recursive on the "structure" of %3tbl%*,
and so we impose an ordering on the elements of that
table. That is, we will represent tables as %2sequences%*. We know
how to represent sequences in LISP: we use lists.
.END
.BEGIN GROUP;TURN ON "{","}";
With this introduction, here's %3locate%*⊗↓The obvious interpretation
of %3tbl%* as a function implies that %3locate%* represents function application;
i.e., %3locate[x;tbl]#%1is%*#tbl(x)%*.←:
.BEGIN TABIT1(10);SELECT 3;GROUP;
.P123:

locate[x;tbl] <=
\[eq[name[first[tbl]];x] → val[first[tbl]];
\ %et%* → locate[x;rest[tbl]] ]
.END
.END

The effect of %3locate%1 is to find the %2first%1 element of %3tbl%1 which
has a %3name%1-component which matches %3x%1.  Having found that
match, the corresponding %3val%1-part is returned. If there were other
matches further along in the sequence %3locate%1 would not see them.
Other representations of tables are certainly possible. This representation
will be useful in later applications.
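
For instance, if we momentarily commit to dotted pairs for table entries,
with %3name%1 selecting the %3car%*-part and %3val%1 the %3cdr%*-part, then:

.BEGIN CENTERIT;SELECT 3;
←locate[Y;((X . 4) (Y . 2) (Y . 9))] = 2
.END

The later pair %3(Y . 9)%1 is invisible; this shadowing will shortly be put
to work.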

.BEGIN TABIT1(10);GROUP;
And here's the new more powerful %3value%9'%1:

.SELECT 3;
value%9'%*[x;tbl] <=
\[isconstant[x] → x;
\ isvar[x] → locate[x;tbl];
\ issum[x] → +[value%9'%*[arg%41%*[x];tbl];value%9'%*[arg%42%*[x];tbl]];
\ isprod[x] → *[value%9'%*[arg%41%*[x];tbl];value%9'%*[arg%42%*[x];tbl]];
\ isexpt[x] → ↑[value%9'%*[arg%41%*[x];tbl];value%9'%*[arg%42%*[x];tbl]] ]

.END
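Problem %aA%1 of {yon(P122)} is now directly computable. Writing %3tbl%1
for the representation of %3{ <x, 4>, <y, 2>, <z, 1> }%1, the evaluation
unwinds informally as:

.BEGIN NOFILL;TABS 10;TURN ON "\";SELECT 3;GROUP;
value%9'%*[%eR%f(%3 x*y + 2*z %f)%3;tbl]
\= +[value%9'%*[%eR%f(%3 x*y %f)%3;tbl];value%9'%*[%eR%f(%3 2*z %f)%3;tbl]]
\= +[*[locate[x;tbl];locate[y;tbl]];*[2;locate[z;tbl]]]
\= +[*[4;2];*[2;1]] = 10
.END
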
Notice that %3tbl%* is carried through  as an explicit argument to 
%3value%9'%1 even though it is only accessed when a variable is recognized.
Notice too that much of the structure of %3value%9'%1 is quite repetitious;
the lines which handle sums, products, and exponentiation are identical
except for the function which finally gets applied to the evaluated arguments.
That is, the basic structure of %3value%9'%1 is potentially of broader application
than just the simple class of polynomials.
In keeping with our search for generality, let's pursue %3value%9'%1 a little
further.

What %3value%9'%1 says is:

%21.%* The value of a constant is that constant.

%22.%* The value of a variable is the current value associated with it in the
table.

%23.%* The value of a function applied to arguments is the result of
applying the function to the evaluated arguments. It just turns out
that the only functions %3value%9'%1 knows about are binary sums, products,
and exponentiation.

.GROUP
Let's clean up %3value%9'%1 a bit.

.BEGIN TABIT1(10);SELECT 3;GROUP;

value%9'%*[x;tbl] <=
\[isconstant[x] → x;
\ isvar[x] → locate[x;tbl];
\ isfun_args[x] → apply[fun[x];eval_args[args[x];tbl]] ]
.END
.APART

The changes are in the last branch of the conditional. We have a new
recognizer, %3isfun_args%* to recognize
function application. We have two new selector functions; %3fun%*
selects the representation of the function --#sum, product, or power
in the simple case; %3args%* selects the arguments or parameters to the
function --#in this case all functions are binary. We have two
new functions to define: %3eval_args%*, which is supposed to evaluate
the arguments using %3tbl%* to find values for any of the variables; 
and %3apply%*, which is used to perform the desired operation on the evaluated
arguments.

.BEGIN TURN OFF "}","{"; TURN ON "#";
Again we are trying to remain as representation-free as possible: thus
the generalization of the algorithm %3value%9'%1, and thus the care
in picking representations for the data structures.
We need to make another data structure decision now; when writing the 
function %3eval_args%*, we will be giving a recursive algorithm.
This algorithm will be recursive on the structure of the first argument,
which is a representation of the arguments to the function. In contrast
to our position when writing the function %3locate%*, there %2is%* a
natural structure on the arguments to a function: they form a 
sequence. That is %3f[1;2;3]%* is typically not the same as %3f[3;2;1]%*
or %3f%1 applied to any other permutation of {1,#2,#3}. 
Thus writing %3eval_args%* as a function,
recursive on the sequence-structure of its first argument, is quite
expected. Here is %3eval_args%*:
.END
.BEGIN TABIT1(10);SELECT 3;GROUP;

eval_args[args;tbl] <=
\[null[args] → ();
\ %et%* → concat[value%9'%*[first[args];tbl]; eval_args[rest[args];tbl]] ]
.END
Notice that we have written %3eval_args%* without any bias toward binary
functions; it will evaluate a sequence of arbitrary length, returning a sequence
of the evaluated arguments.

.GROUP;
There should be no real surprises in %3apply%*; it gets the representation
of the function name and the sequence of evaluated arguments and
does its job:

.BEGIN TABIT1(10);SELECT 3;GROUP;

apply[fn; evargs] <=
\[issum[fn] → +[arg%41%*[evargs];arg%42%*[evargs]];
\ isprod[fn] → *[arg%41%*[evargs];arg%42%*[evargs]];
\ isexpt[fn] → ↑[arg%41%*[evargs];arg%42%*[evargs]] ]
.END
.APART
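For instance, if %3PLUS%1 is the representation of the sum function, then
%3apply[PLUS;(2, 3)]%1 falls into the %3issum%* branch and yields
%3+[2;3] = 5%1.
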

Notice that if we should desire to recognize more functions then we need only
modify %3apply%*. That would be a short-term solution, but if we
wished to do reasonable computations  we would like a more general
function-definition facility. Such a  feature would allow new functions to
be defined during a computation; then when an application of that function
were needed, the %3value%*-function would find that definition and
apply %2it%* in a manner analogous to the way the pre-defined functions are
applied. 
How far away are we from this more desirable  super-%3value%*?
Well %3value%9'%1 is already well-endowed with a mechanism for locating
values; perhaps we can exploit this judiciously placed code.
In what context would we be interested in locating function definitions?
Here's an example:

.BEGIN tabit1(15);
.P124:

%aB%*\"What is the value of %3f[4;2;1]%* when %3f[x;y;z] <= x*y + 2*z%*?"
.END

Notice that if we have a means of recovering the definition of %3f%*,
then we can reduce the problem to %aA%* of {yon(P122)}.
We will utilize the table-mechanism, and therefore will use %3locate%*
to retrieve the definition of the function %3f%*.
In our prior applications of %3locate%* we would find a constant
as the associated value. Now, given the name %3f%*, we would expect
to find the definition of the function.
The question then, is how do we represent the definition of %3f%*?
Certainly  the body of the function, %3x*y#+#2*z%*, is one of the
necessary ingredients, but is that all? Given the expression %3x*y#+#2*z%*
can we successfully compute %3f[4;2;1]%*?
Not yet; we need to know the correspondence between the values %31,#2,#4%*
and the variables, %3x, y, z%*. That information is present in our
notation %3f[x;y;z]#<=#...%*, and is a crucial part of the definition of %3f%*.
That is, the %2order%* of the variables appearing after the function name
is an integral part of the definition:
%3f[y;z;x]#<=#x*y#+2*z%* defines a different function.

Since we are now talking about %2representations%* of functions, we
are getting into the realm of abstract data structures. We have a
reasonable understanding now of the essential components of such a representation.
For our purposes, a function has three parts:

.BEGIN INDENT 10,10;

%21.%* A name; %3f%* in the current example.

%22.%* A variable list; %3[x;y;z]%* here.

%23.%* A body; %3x*y + 2*z%* in the example.
.END

We do not need a complete study of representations of functions. For
our purposes we can assume a representation exists, and that
we are supplied with three selectors to retrieve the components mentioned
above.

.BEGIN INDENT 10,10;

%21.%* %3name%* selects the name component from the representation. We have
actually seen %3name%* before in the definition of %3locate%* on {yon(P123)}.

%22.%* %3varlist%* selects the list of variables from the representation.
We have already seen that the natural way to think about this component
is as a sequence. Thus the name: %3varlist%*.

%23.%* %3body%* selects the expression which is the content of the definition.
.END

Given a function represented in the table according to these conventions, how
do we use the information to effect the evaluation of something like 
%3f[4;2;1]%*?
First %3value%9'%1 will see the representation of %3f[4;2;1]%*;
it  should recognize an instance of function-application at the following
line of %3value%9'%1:

.BEGIN CENTERIT; SELECT 3;
←isfun_args[x] → apply[fun[x];eval_args[args[x];tbl]]

.END
This should cause an  evaluation of the arguments and then
pass on the hard work to %3apply%*.

Clever %3apply%* should soon realize that %3f%* is not the name of a 
known function. It should then extract the definition of %3f%* from the
table; associate the evaluated arguments (%34,#2,#1%*) with the variables
of the variable list (%3x,#y,#z%*), making a new table with name-value pairs
(%3<x,#4>,#<y,#2>,#<z,#1>%1).
Now we are back to the setting of problem %aA%* of {yon(P122)}.
We should ask %3value%9'%1 to evaluate the %3body%*-component
of the function using the new %3tbl%*.
This works fine for %3x%1, %3y%1, and %3z%1; within the evaluation of the body
of %3f%1 we will find the right bindings for these variables. But we might 
also need some information from the original %3tbl%1. The evaluation of the
body of %3f%1 might entail the application of some function definition
present in %3tbl%1. For example, the representation of 

.BEGIN CENTERIT;
←"what is %3g[2]%1 where %3g[x] <= x+s[x];%1 and %3s[x] <= x*x%1?"
.END
Within the body of %3g%1 we need the definition of %3s%1. Therefore,
instead of building a new table we will add the new bindings to the front of the
old table. Since %3locate%1 begins its search from the front of the table
we will be assured of finding the new bindings; since the old table is still
accessible we are assured of finding any necessary previous bindings.

Now we should be able to go off and create a new %3value%9''%1.
Looking at the finer detail of %3value%9'%1 and %3apply%*,
we can see a few other modifications need to be made.
%3apply%9'%1 will need to %3locate%* the function
definition and thus %3tbl%* should be included as a third argument to 
%3apply%9'%1. That is, inside %3apply%9'%1 we will have:

.BEGIN CENTERIT;

←%3isfun[fn] → apply%9'%3[locate[fn;tbl];evargs;tbl]%1;
.END

After %3locate%1 has done its work,
this line (above) will invoke %3apply%9'%1 with a function definition as first
argument. We'd better prepare  %3apply%9'%1 for such an eventuality with the
following addition:
.BEGIN CENTERIT; SELECT 3;

←isdef[fn] → value%9''%*[body[fn];newtbl[varlist[fn];evargs;tbl]];
.END

What does this incredible line say? It says 
.BEGIN INDENT 5,5;
"Evaluate the body of the function using a new table manufactured from
the old table by adding the pairings of the elements of the variable 
list with the evaluated arguments."
.END
It also says we'd better write %3newtbl%*. This function will be making
a new table by adding new name-value pairs to an existing table.
So we'd better name a constructor
to generate a new name-value pair:
.BEGIN INDENT 0,5;

%3mkent%* is the constructor to make new entries. It will take two arguments:
the first will be the name, the second will be the value.
.END

Since we have assumed that the structure of tables, variable-lists, and
calling sequences to functions are %2all%* sequences, we will
write %3newtbl%* assuming this representation.

.BEGIN TABIT1(10);GROUP;SELECT 3;

newtbl[vars;vals;tbl] <=
\[null[vars] → tbl;
\ %et%* → concat[mkent[first[vars];first[vals]];newtbl[rest[vars];rest[vals];tbl]] ]
.END
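
For instance, %3newtbl[(X Y Z);(4 2 1);tbl]%1 prepends entries pairing
%3X%1 with %34%1, %3Y%1 with %32%1, and %3Z%1 with %31%1 onto the front of
%3tbl%1; since %3locate%1 searches from the front, these new bindings
shadow any old ones.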

And finally here's the new %3value%9''%*-apply%9'%1 pair, together with the
revised %3eval_args%1:
.P125:

.BEGIN TABIT1(10);SELECT 3;GROUP;

value%9''%*[x;tbl] <=
\[isconstant[x] → x;
\ isvar[x] → locate[x;tbl];
\ isfun_args[x] → apply%9'%3[fun[x];eval_args[args[x];tbl];tbl] ]
.END
.BEGIN TABIT1(10);SELECT 3;GROUP;

apply%9'%3[fn;evargs;tbl] <=
\[issum[fn] → +[arg%41%*[evargs];arg%42%*[evargs]];
\ isprod[fn] → *[arg%41%*[evargs];arg%42%*[evargs]];
\ isexpt[fn] → ↑[arg%41%*[evargs];arg%42%*[evargs]];
\ isfun[fn] → apply%9'%*[locate[fn;tbl];evargs;tbl];
\ isdef[fn] → value%9''%*[body[fn];newtbl[varlist[fn];evargs;tbl]] ]
.END
.BEGIN TABIT1(10);SELECT 3;GROUP;

eval_args[args;tbl] <=
\[null[args] → ();
\ %et%* → concat[value%9''%*[first[args];tbl]; eval_args[rest[args];tbl]] ]
.END

Let's go through a complete massaging of %aB%* of {yon(P124)}.
As before we will use %eR%* as a mapping from expressions to representations.
Thus we want to pursue: 
.BEGIN TURN ON "←";TURN OFF "{","}";
←%3value%9''%*[%eR%f(%3 f[4;2;1] %f)%*; %eR%f(%3{ <f, [[x;y;z] x*y + 2*z]> }%f)%3]%1.

Before we begin, let us denote the initial symbol table, 
%eR%f(%3{#<f,#[[x;y;z]#x*y#+#2*z]>#}%f)%1 as %3init%1. This will simplify many 
of the equations.
Notice that
our representation of %3f%* in %3init%1 has associated the variable list %3[x;y;z]%1
with the body of the function. Thus %3locate%1, operating on this table with the
name %3f%1, will return a representation of %3[[x;y;z]#x*y#+#2*z]%1.

%3isfun_args%* should be satisfied and thus the computation should reduce to:

←%3apply%9'%3[fun[%eR%f(%3 f[4;2;1] %f)%3];eval_args[args[%eR%f(%3 f[4;2;1] %f)%3];init];init]%1 or:

←%3apply%9'%3[%eR%f(%3 f %f)%3;eval_args[%eR%f(%3 [4;2;1] %f)%3;init];init]

%3eval_args%1  will build a sequence of the evaluated arguments: %3(4,#2,#1)%1,
resulting in:

←%3apply%9'%3[%eR%f(%3 f %f)%3;(4, 2, 1);init]

%3apply%9'%1 should decide that %3f%* satisfies %3isfun%* giving:

←%3apply%9'%*[value%9''%*[%eR%f(%3 f %f)%3;init];(4, 2, 1);init]%1


The computation of %3value%9''%*[%eR%f(%3 f %f)%3;init]%1 is interesting:
%3value%9''%1 should recognize %3f%* as a name whose value must be found;
therefore %3locate%* will be called and will retrieve the definition.

←%3apply%9'%*[%eR%f(%3  [[x;y;z] x*y + 2*z] %f)%3;(4, 2, 1);init]%1 should be the result.

This first argument should not make %3apply%9'%1 unhappy. It should
realize that %3isdef%* is satisfied, and thus:

←%3value%9''%*[body[%eR%f(%3  [[x;y;z] x*y + 2*z] %f)%3];newtbl[varlist[%eR%f(%3  [[x;y;z] x*y + 2*z] %f)%3];(4,2,1);init]]%1 or:

←%3value%9''%*[%eR%f(%3  [x*y + 2*z] %f)%3;newtbl[%eR%f(%3  [x;y;z]  %f)%3;(4,2,1);init]]%1

after %3body%* and %3varlist%* are finished.
But 
%eR%f(%3#[x;y;z]#%f)%1 is 
%3(%eR%f(%3#x#%f)%3,#%eR%f(%3#y#%f)%3,#%eR%f(%3#z#%f)%3)%1, 
and therefore the computation of
%3newtbl%*  will build a new table with entries for %3x,#y,#%1#and#%3z%1 on
the front:

←%eR%f(%3{ <x, 4>, <y, 2>, <z, 1>, <f, [[x;y;z] x*y + 2*z]> }%f)%1.

Thus we call %3value%9''%1 with:

←%3value%9''%*[%eR%f(%3  [x*y + 2*z] %f)%3; %eR%f(%3{ <x, 4>, <y, 2>, <z, 1>, <f, [[x;y;z] x*y + 2*z]> }%f)%3]%1.


.END
Now we're  back at problem %aA%* of {yon(P122)}.
.CENT(Time to take stock)
We have written a reasonably sophisticated algorithm here. It covers
evaluation of a large class of arithmetic functions. We should 
examine the results quite carefully.

First notice that we have written the algorithm with almost no
concern for representation. We %2assume%* that representations
are available for such varied things as arithmetic expressions,
tables, calls on functions, and even function definitions. Very
seldom did we commit ourselves to anything close to a concrete
representation, and only then with great reluctance. It was
with some sadness that we imposed a sequencing on elements of
tables.  
Variable lists and calling sequences were not as traumatic;
we claimed their natural structure was a sequence. As always,
if we wish to run these programs on a machine we must supply some
representations, but even then the representations will only
interface with our algorithms at the constructors, selectors and recognizers.

We have made some more serious representational decisions in
the structure of the %2algorithm%*.  We have encoded a version
of the %aCBV%*-scheme of {yon(P121)}. We have seen what kinds of
difficulties %2that %* can get us into. We will spend a 
large amount of time in {yonsec(P113)} discussing the problems
of evaluation⊗↓A second decision was implied in our handling of function
definitions; namely we bound the function name to a structure consisting
of the variable list and the function body. This representation 
gives the expected result in most cases, but involves one of the
more problematic areas of programming languages: how do you
find the bindings of variables which do not appear in the current variable
list? Function names belong in this category. Such variables are called
free variables. The scheme proposed in this section finds the
binding which is current when the function was applied. The
programming language, ALGOL, advocates finding the binding which was current
when the function was defined.←.

.P186:
Finally, our decisions on the data structures and the algorithms were not
made independently. For example, there is strong interaction between 
our representation
of tables and the algorithms, %3locate%1 and %3newtbl%1, which manipulate
them. We should ask how much of this interaction is inherent 
and how much is gratuitous. For example we have remarked that our representation
can have pairs with duplicate first elements. It is the responsibility of %3locate%1
to see that we find the expected pair. If we wrote %3locate%1 to search from right
to left, we could get the wrong pair. 
We %2could%1 write %3newtbl%1 to be more selective; it could manufacture
a table without such duplications:

.BEGIN TABIT3(10,19,28);GROUP;SELECT 3;

newtbl[vars;vals;tbl] <=  
\[null[tbl] → \[null[vars] → ();
\\ %et%3 → concat\[mkent[first[vars];first[vals]];
\\\ newtbl[rest[vars];rest[vals];( )]]];
\ member[name[first[tbl]];vars] → newtbl[vars;vals;rest[tbl]];
\ %et%* → concat[first[tbl];newtbl[vars;vals;rest[tbl]]] ]
.END
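
To see the difference, assume dotted-pair entries and trace
%3newtbl[(X);(5);((X . 4) (Y . 2))]%1: the pair %3(X . 4)%1 is discarded
because %3X%1 is a member of %3vars%1; the pair %3(Y . 2)%1 is kept; and
when the old table is exhausted the new entry is manufactured, giving
%3((Y . 2) (X . 5))%1. The simpler %3newtbl%1 would have produced
%3((X . 5) (X . 4) (Y . 2))%1, relying on %3locate%1 to shadow the old pair.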

This version of %3newtbl%1 requires much more computation than the alternative.
Its advantage is that the "set"-ness of symbol tables is  maintained. Luckily
the "set" property is one which we need not depend on for our algorithms. Indeed
we will frequently depend on the obverse property: that the previous values
of variables can be found further along in the sequence.


The main point of this example however is to impress on you
the importance of writing at a sufficiently high level
of abstraction. We have produced a non-trivial algorithm
which is clear and concise. If it were desirable to 
have this algorithm running on a machine we could code it and
its associated data structure representations in a very short time.
In a very short time %2we%* will be able to run this algorithm
on a LISP machine.


.<<AND THE GREAT MOTHERS>>
.REQUIRE "PROB6.PUB"SOURCE_FILE;
.SS(Another Respite)

We have again reached a point where a certain amount of reflection would be
good for the soul.  
Though this is not a programming manual we would be remiss if we did not
attempt to analyze the style which we have tried to exercise when writing
programs. 

1. Write the algorithm in an abstract setting; do not muddle the abstract
algorithm with the chosen representation.  If you follow this dictum your
LISP programs will never use %3car, cdr, cons%1, and %3atom%*.
All instances of these LISP primitives will be banished to small subfunctions
which manipulate representations. 

2. When writing the abstract program, do not be afraid to  cast off
difficult parts of the implementation to subfunctions. Remember that if
you have trouble keeping the details in mind when %2writing%* the program,
then the confusion involved in %2reading%* the program at some later time
will be overwhelming. Once you have convinced yourself of the
correctness of the current composition, then worry about the construction
of the subfunctions. Seldom does the process of composing a program flow
so gently from top-level to specific representation. 
Only the toy programs are easy; the construction of the practical
program will be confusing, and will require much rethinking.
But bring as much structure as you can to the process.

3. From the other side of the question, don't be afraid to look at
specific implementations, or specific data-structure representations before
you begin to write. There is something quite comforting about a "real"
data structure. Essentially data structures are static objects
⊗↓At least within the program presently being constructed.←, while
programs are dynamic objects. A close look at a possible representation
may get you a starting point and as you write the program it will become
clear when you are depending on the specific representation and when
you are just using properties of an abstract data structure.

Perhaps the more practical reader is overcome by the inefficiencies inherent
in these proposals. Two answers: first, "inefficiency" is a very ethereal concept.
Like "structured programming", it is difficult to define but recognizable
when it occurs.
Hardware and software development has been such that last year's "inefficiency"
is this year's %3passe%*  programming trick. 
But even at a more topical level, much of what seems inefficient can now be
straightened out by a compiler (see {yonsec(P107)}). 
Frequently compilers can do very clever optimizations to generate efficient code.
It is better to leave the cleverness to the compiler, and the clarity to the 
programmer.

Second, the problems in programming are not those of efficiency. They are
problems of %2correctness%*. How do you write a program which works?
Until practical  tools are developed for proving correctness it is up
to the programmer to certify his programs. Any methodology which  can
aid the programmer will be most welcome.
Clearly, the closer you can write the program to your intuition, the
less chance there is for error. This was one of the reasons for developing
high-level languages. But do not forget that the original motivation for
such languages was a convenient notation for expressing numerical problems,
that is, for writing programs to express ideas which have already had their
juices extracted as mathematical formulas.
With data structures, we are able to formalize a broader range of domains, 
expressing our ideas as data structure manipulations rather
than as numerical relationships.

What kinds of errors are prevalent in data structure programming? At least
two kinds: errors of omission#--#misunderstanding of the basic algorithm; and
errors of commission#--#errors due to misapplied cleverness in attempting
to be efficient.

The occurrences of errors of omission can be lessened 
by  presenting the user with
programming constructs which are close to the intuited algorithm.
Such constructs include control structures, data structures,
and  representations for operations.

Errors of commission comprise
the great majority of the present day headaches. It is here that programming
%2style%* can be  beneficial: keep the representation of the data structures 
away from the description of the
algorithm; write concise abstract programs, passing off responsibilities to
subfunctions. Whenever a definition of "structured programming" is arrived at,
this advice on programming style should be included.

Before closing this discussion on the philosophy of LISP programming, we
cannot help but note that the preceding section, %2The Great Mothers%*, 
completely ignored our good advice. This would be a good time for the
interested reader to abstract the %3tgmoaf%* algorithm from the
particular data representation. This detective work will be most
rewarding.

.CENT(Problems)

I. Write an abstract version of %3tgmoaf%*.
.<<PROVING PROPERTIES>>
.NEXT PAGE;
.SS(Proving properties of programs,,P15:)
.BEGIN TURN ON "←","#";
People are becoming increasingly aware of the importance of giving 
convincing
arguments for such things as the correctness or equivalence of programs. These are
both very difficult enterprises. We will only sketch a proof
of a simple property of two programs and leave others as problems for the 
interested reader.
How do you go about proving properties of programs?
In {yonss(P120)} we noted certain benefits of defining sets
using inductive definitions. First, there was a natural way of thinking about
the construction of an algorithm over that set. We have exploited that 
observation in our  study of LISP programming. What we need recall here is 
the observation that inductive style proofs (see %aPRF%* on {yon(P119)}) 
are valid forms of reasoning
over such domains. Since we in fact defined our data structure domains in
an inductive manner, it seems natural to look for inductive arguments
when proving properties of programs. This is indeed what we do; we perform
induction on the structure of the elements in the data domain.


For example,
using the definition of %3append%* given on {yon(P22)} and the definition
of %3reverse%* given on {yon(P23)}, we wish to show that:

←%3append[reverse[y];reverse[x]] = reverse[append[x;y]]%*,

for any lists, %3x%*, and %3y%*. The induction will be on the structure of %3x%*.
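
Before beginning the induction it is worth checking a small instance: with
%3x = (A)%1 and %3y = (B)%1 we have %3reverse[append[(A);(B)]] =
reverse[(A B)] = (B A)%1, and %3append[reverse[(B)];reverse[(A)]] =
append[(B);(A)] = (B A)%1 as well.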

.BEGIN TABIT3(5,10,15);
.GROUP
\%2Basis%*: %3x%* is %3( )%*.
\We must thus show: %3append[reverse[y];( )] = reverse[append[( );y]]%*
\But: %3reverse[append[( );y]] = reverse[y]%*  by the def. of %3append%*.
\We now establish the stronger result:  %3append[z;( )] = z%* ⊗↓In the following proof
several intermediate steps have been omitted.←
\\%2Basis:%* %3z%* is %3( )%*.
\\Show %3append[( );( )] = ( )%*. Easy.
\\%2Induction step:%* Assume the lemma for lists, %3z%*, of length n;
\\Prove: %3append[concat[x;z];( )] = concat[x;z]%*.
\\Since %3concat[...]%* is not %3( )%*, then applying the definition of %3append%*
\\says we must prove: %3concat[x;append[z;( )]] = concat[x;z]%*.
\\But our induction hypothesis is applicable since %3z%1 is shorter than %3concat[x;z]%1.
\\Our result follows.
\So the Basis for our main result is established.
.APART
.GROUP
\%2Induction step:%* Assume the result for lists, %3z%*, of length n;
\Prove:
\←   (1)   %3append[reverse[y];reverse[concat[x;z]]] = reverse[append[concat[x;z];y]]%*
\ Using the definition of %3reverse%* on the LHS of (1) gives:
\←   (2)   %3append[reverse[y];append[reverse[z];list[x]]]%*.
\ Using the definition of %3append%* on the RHS of (1) gives:
\←   (3)   %3reverse[concat[x;append[z;y]]].%*
\ Using the definition of %3reverse%* on (3) gives:
\←   (4)   %3append[reverse[append[z;y]];list[x]].%*
\ Using our induction hypothesis on (4) gives:
\←   (5)   %3append[append[reverse[y];reverse[z]];list[x]]%*
\ Thus we must establish that    (2) = (5).
\ But this is just an instance of the associativity of %3append%*:
\←%3append[x;append[y;z]] = append[append[x;y];z].%*  (See problem I, below.)
.END

The structure of the proof is analogous to proofs by mathematical 
induction in elementary number theory. The ability to perform such proofs
is a direct consequence of our careful definition of data structures.
Examination of the proof will show that there is a close relationship
between what we are inducting on in the proof and what  we are recurring on
during the evaluation of the  expressions. A program
written by Boyer and Moore has been reasonably successful in generating
proofs like the above by exploiting this relationship.


.CENT(Problems)

I. Prove the associativity of %3append%*.

II. Analysis of the above proof shows frequent use of other results for LISP
functions. Fill in the details. Investigate the possibility of formalizing
this proof, showing what axioms are needed.

III. Show the equivalence of %3fact%* ({yon(P20)}) and %3fact%41%1 ({yon(P21)}).

IV. Show the equivalence of %3length%* and %3length%41%1 ({yon(P19)}).

V. Using the definition of %3reverse%*, given on {yon(P16)}, prove:

←%3reverse[reverse[x]] = x%*.

.END